PREFACE


This book was conceived as a text combining the courses of
linear algebra and analytic geometry. It originated as a course
of lectures delivered by N. V. Efimov at Moscow State University
(mechanics and mathematics department) in 1964-1966. However,
the material of these lectures has been completely reworked and
substantially expanded. We have tried to bear in mind the
requirements of other mathematical disciplines and also of
mechanics and physics. We hope that all parts of the text will be
useful. The only preparation required for this text can be given
in a first-semester course of analytic geometry and algebra at the
most elementary level. All that is needed is a firm grasp of the
elements of these subjects. For Chapter XII the student should be
acquainted with projective transformations and the projective
properties of figures in the plane. Also, in Chapter X the reader
may simplify his task by skipping Subsections 13 to 23 (Section 3)
and Subsection 10 of Section 7. What is left of Chapter X can serve
as a minimal algebraic basis for the theory of multidimensional
integration.

It may be noted in conclusion that the first five chapters already
contain material with broad applications in mathematics, mechanics,
and physics. These chapters, supplemented with some of the
material of subsequent chapters, can be utilized in higher
technical schools with a more advanced mathematics curriculum.


N. V. Efimov 
E. R. Rozendorn



INTRODUCTION 


In mathematics and its applications, one often has to deal with
certain sets of objects for which so-called linear operations have
been defined: addition and multiplication by a number (scalar).
For example, in mechanics we consider all kinds of forces applied
to a given rigid body. Two forces applied to a single point may
be added, that is to say, replaced by a single force applied to that
point. The force may be multiplied by a scalar $\alpha$, which means
"increased $\alpha$ times" in the direction of action. In mechanics we
also consider the composition of velocities and the multiplication
of a velocity by a scalar, and the composition of accelerations and
the multiplication of an acceleration by a scalar. Forces, velocities
and accelerations differ as to their physical nature, but the linear
operations performed on them are, from the geometrical point of
view, of a single nature. It is for this reason that in mechanics
we have a general unified mode of depicting these entities in the
form of directed line segments. In this way, they are all handled
by the general rules of addition and scalar multiplication of
geometrical vectors.

However, this generalization goes much farther. Consider, for
example, the set of all functions continuous on the real number
line, or the set of all periodic functions with a given period, or the
set of all algebraic polynomials. It is quite natural in each of
these sets to consider linear operations (understanding the sum of
functions and the product of a function by a number as is usual
in analysis). The objects we are now speaking of are not like
forces, velocities or accelerations, or geometrical vectors. Moreover,
the linear operations performed on them differ from the linear
operations performed on the vector quantities of mechanics or on
geometrical vectors.

However, there is something common to them all that permits 
studying linear operations abstractly, quite apart from the specific 
nature of the entities. 





Firstly, in any one of our examples the linear operations carried
out on the elements of a given set (that is, on the objects that
make up the set) yield elements of the same set. Namely, by
adding geometrical vectors or multiplying them by a scalar, we
obtain geometrical vectors; by adding continuous functions or
multiplying them by a scalar, we get continuous functions. The
same goes for periodic functions with a given period and for
algebraic polynomials.

What is more, linear operations that differ for different sets 
have certain common properties (which will be examined in the 
first chapter). The existence of common properties permits us to 
study linear operations as such. 

The study of sets with specified linear operations leads to the 
concept of a linear space. The theory of linear spaces finds very 
broad applications in modern mathematics and allied sciences. 

A linear space will be defined in the first section. The definition
will not contain any description of the elements of the sets
considered or of the linear operations performed. The only thing
required will be certain properties of linear operations that are
common to all particular cases. These requirements are expressed as
the axioms of a linear space. It is worth mentioning that the
requirements expressed in the axioms are very few and there remains
the possibility of adding new assumptions to them. Therefore, a
certain classification appears in the general concept of a linear
space so that, actually, we are dealing not with a single linear
space but with distinct classes of linear spaces, and the theory
based on the axioms of a linear space becomes diversified.

All linear spaces may be separated into finite-dimensional and
infinite-dimensional spaces. Finite-dimensional spaces
(one-dimensional, two-dimensional, three-dimensional, and so forth)
are studied in linear algebra, which makes up the subject matter of
this text. Infinite-dimensional spaces are considered in various
parts of functional analysis. We will speak of them only
occasionally to illustrate certain general conclusions.

An instance of a finite-dimensional space is the three-dimensional
space of geometrical (free) vectors. This space contains
within it an infinite number of two-dimensional and one-dimensional
spaces called subspaces (every two-dimensional subspace
consists of vectors lying in one plane, and every one-dimensional
subspace consists of vectors lying on a single straight line). Thus,
for one-, two- and three-dimensional linear spaces we have
geometrical models that are naturally associated with our pictorial
conceptions of vectors. When passing to multidimensional spaces,
the pictorial nature of the entities is partially lost, but the theory
of these spaces retains its geometrical character. The point is that
its basic concepts are constructed by borrowing from the
three-dimensional case and appropriately generalizing to the
multidimensional case. The retention of geometrical terminology also
plays a part. For instance, when speaking of diverse sets, we call
them spaces. Note too that the elements of any linear space are
conventionally called vectors. And so linear spaces are also
termed vector spaces. The geometrical nature of the terminology
and of the basic concepts of linear algebra helps to make contact
with geometry. We have in view here analytic geometry and,
particularly, multidimensional analytic geometry, that is, the
multidimensional analogue of ordinary (three-dimensional) analytic
geometry. What is more, linear algebra and analytic geometry are
so closely connected that it is difficult to draw any hard and fast
line between them. And we will not try to do that. We have
already stated that the subject matter of this text is linear algebra.
With the same justification we can say that its subject is
multidimensional analytic geometry.
LINEAR SPACES


§ 1. Axioms of a linear space 

1. Suppose we have a set L consisting of any kind of elements.
We denote these elements by the lower-case Latin letters a, b,
..., x, y, .... However, in only one case we will use the lower-case
Greek letter $\theta$ in a similar instance. Together with the elements
of the set L we will consider any numbers, real or complex, which
will be denoted by lower-case Greek letters $\alpha$, $\beta$, ... (with the
exception of $\theta$).

2. We assume that the concept of equality of elements has been
defined in the set L. This means that all elements of L have been
distributed in some way into classes (subsets of L) so that distinct
classes do not have any elements in common. Then two elements,
a and b, are taken to be equal ($a = b$) if they belong to one and
the same class. Every class can also consist of a single element, in
which case the equality $a = b$ means that a and b denote the same
element of L.

Later on we will sometimes speak of an admissible replacement of
one element of L by another element. A replacement is admissible
if it is made within one class; that is to say, one element is
replaced by any equal element.

3. In some cases, instead of a specified partition of L into
classes of equal elements, certain conditions of admissible
replacements will be indicated (that is, conditions under which the
elements are taken to be equal). Then, for an arbitrary element a
in L there will be defined a class consisting of all elements of L
equal to the element a. However, in order to obtain the required
partition of L into such classes, three circumstances must be
ensured.

(1) The element a itself must belong to the class it defines; that
is, the conditions of equality must be such that the element a is
considered equal to itself: $a = a$ (in other words, the replacement
of an element by itself must be considered admissible).

(2) If $a = b$, then it must be true that $b = a$.

(3) If $a = b$ and $b = c$, then it must be true that $a = c$.

Only when these three circumstances hold are any two elements
belonging to the class defined by a equal to each other. Besides,
that class then includes all elements of L that are equal to any
one element of the class.
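
The three conditions can be checked mechanically on a small finite set. The following Python sketch is only an illustration: the helper names `is_equivalence` and `classes`, the sample set, and the two rules of equality are assumptions introduced here, not part of the text. It tests conditions (1)-(3) for a given rule and then collects the classes of equal elements.

```python
from itertools import product

def is_equivalence(elements, equal):
    """Check conditions (1)-(3) for the rule `equal` on a finite set."""
    reflexive = all(equal(a, a) for a in elements)
    symmetric = all(equal(b, a)
                    for a, b in product(elements, repeat=2)
                    if equal(a, b))
    transitive = all(equal(a, c)
                     for a, b, c in product(elements, repeat=3)
                     if equal(a, b) and equal(b, c))
    return reflexive and symmetric and transitive

def classes(elements, equal):
    """Collect classes of equal elements (assumes conditions (1)-(3) hold)."""
    result = []
    for a in elements:
        for cls in result:
            if equal(a, cls[0]):
                cls.append(a)
                break
        else:
            result.append([a])
    return result

# Hypothetical set: the integers 0..6.
L = list(range(7))
# Rule 1: equal when the remainders on division by 3 coincide.
same_remainder = lambda a, b: a % 3 == b % 3
# Rule 2: "differ by at most 1"; condition (3) fails for it.
close = lambda a, b: abs(a - b) <= 1

print(is_equivalence(L, same_remainder))  # True
print(classes(L, same_remainder))         # [[0, 3, 6], [1, 4], [2, 5]]
print(is_equivalence(L, close))           # False
```

The second rule shows why condition (3) cannot be dropped: 0 and 1 are "equal", 1 and 2 are "equal", yet 0 and 2 are not, so no partition into classes results.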

The foregoing is illustrated by the examples of Section 2. 

4. We will say that in the set L are defined the operations of 
addition and multiplication by a scalar if: 

(1) to every two elements a, b of L there is associated a certain
element of L called their sum and denoted by $a + b$;

(2) to every scalar $\alpha$ and every element a in L there is
associated a certain element of L called the product of $\alpha$ by a
or a by $\alpha$; this product is denoted by $\alpha a$ or $a\alpha$.

It is assumed that the operations of addition and scalar
multiplication are invariant with respect to admissible replacements
of the elements of the set L: if $a = a'$, $b = b'$, then
$a + b = a' + b'$ and $\alpha a = \alpha a'$.

Also, the following eight axioms are assumed to hold true: 

(1) for any a, b in L,

$$a + b = b + a$$

This is the commutative property for addition;

(2) for any a, b, c in L,

$$(a + b) + c = a + (b + c)$$

This is the associative property. It permits writing a sum without
recourse to brackets: $a + b + c = (a + b) + c = a + (b + c)$.
Also, by the first axiom, the order of the terms is immaterial;

(3) the set L has an element $\theta$ such that

$$a + \theta = a$$

for any a in L. $\theta$ is called the zero element;

(4) for every element x in L there is an element y in L such that

$$x + y = \theta$$

The element y is called the inverse of x and is denoted $-x$;

(5) $1 \cdot a = a$;

(6) $\alpha(\beta a) = (\alpha\beta)a$;

(7) $(\alpha + \beta)a = \alpha a + \beta a$;

(8) $\alpha(a + b) = \alpha a + \alpha b$.

In the last four axioms, a and b denote arbitrary elements of L;
$\alpha$ and $\beta$ are arbitrary scalars.
Note that the property expressed by the seventh axiom is called
the distributive property for a factor taken from L (this axiom
permits distributing a factor from L over the components of a
numerical factor). The eighth axiom expresses the distributive
property for a numerical factor.

5. Basic definition. The set L, together with the operations
specified in it of addition and multiplication by a scalar, is called
a linear space.

We stress the fact that it is assumed in this definition that
addition and multiplication satisfy all the properties enumerated in
Subsection 4.

The eight axioms of Subsection 4 are called the axioms of a
linear space.

As has already been mentioned in the introduction, the elements 
of a linear space are also called vectors, and so a linear space is 
likewise known as a vector space. Very often we will call the 
set L a space without using any modifying adjective, and it will 
be assumed to be a vector space. 

6. If we have a space L and multiplication of vectors solely by
real numbers is defined, then L is termed a real vector space. If
multiplication of the vectors of L is also defined for complex
numbers, then the space L is termed a complex vector space.

In the future, the term "arbitrary scalar" will mean any real
number if we are speaking of a real space and any complex
number if we are dealing with a complex space.

A substantial portion of the facts stated in the first few chapters
of this book refer both to real and complex spaces. If a certain
property holds true only for a real or only for a complex space,
that will be specially pointed out.

7. Occasionally, instead of multiplication by a real or a complex
number we consider multiplication of the elements of L by
elements of an arbitrary algebraic field U (all the eight axioms of
a linear space must hold true, of course). In this case, the set L
together with the specified linear operations is termed a linear (or
vector) space over the field U.

§ 2. Examples of linear spaces 

Preliminary remark. If, relative to any specific set equipped
with linear operations, it is asserted that the set is a linear space,
then to prove that assertion it is necessary to verify that the
specified operations are indeed linear, that is to say, that they
satisfy the requirements of the eight axioms of a linear space.
1. The space of geometric vectors. Consider the set of all
geometric vectors in three-dimensional Euclidean space. Note that
two elements of this set, that is, two vectors, are considered equal
if and only if they are collinear, have equal lengths and are in
the same direction. We thus have in mind free vectors whose point
of application may be chosen arbitrarily.

Admissible replacements of a vector consist in parallel
translations to new points of application. Clearly, the three
conditions of Subsection 3 of Section 1 hold true. Linear operations
on geometric vectors are carried out in the familiar way: addition by
means of the parallelogram rule, while multiplication by a real
number $\alpha$ represents an $\alpha$-fold stretching of the vector. Both
operations are invariant under admissible replacements. Indeed, if
$a = a'$, $b = b'$, then the parallelogram constructed on the vectors
$a'$, $b'$ is obtained by a parallel translation of the parallelogram
constructed on a, b. Thus, the vector $a' + b'$ is obtained by a
parallel translation of the vector $a + b$, that is, $a + b = a' + b'$.
It is quite apparent that the equation $\alpha a = \alpha a'$ also holds
true.

Geometric vectors with the indicated definition of linear
operations form a real linear space. The zero element here is a vector
of zero length. If x is any vector, then the inverse is $y = -x$,
which is a vector of the same length but in the opposite direction.
The requirements of the axioms (1)-(8), Subsection 4, Section 1,
hold true. This is evident from simple geometrical reasoning and
is of course in no way accidental. The point is that geometric
vectors served as the original model for the general concept of a
linear space. In other words, the axioms (1)-(8) express certain
properties of linear operations on geometric vectors that are quite
familiar from elementary vector algebra.

One might readily ask why axioms (1)-(8) do not include certain
simple and important properties of geometric vectors that are
constantly utilized in vector computations: for instance, that the
multiplication of a vector by the scalar zero yields a zero vector,
or that in the multiplication of any vector x by the scalar $-1$ we
get the opposite vector $-x$. It turns out that this is not necessary
since such properties may be proved, that is, derived from the
axioms, which is what will be done in Section 3.

2. Zero space. Let L be a set consisting of only one element.
What that element is is immaterial. Let us denote it by the letter
$\theta$. We now define linear operations in L, assuming that $\theta$ added
to itself yields $\theta$ and that when it is multiplied by a real number
it also yields $\theta$. It is easy to see that the axioms (1)-(8) hold
true. Hence, the given set L is a real linear space consisting
obviously of the sole zero element. It is just as easy to define the
set L as a complex linear space.

Remark. All other (real or complex) linear spaces of necessity
have an infinitude of elements, for in Subsection 2 of Section 3 it
is shown that if a linear space contains at least one element a
different from the zero element, then for distinct scalars $\alpha$ and
$\beta$ the elements $\alpha a$ and $\beta a$ are also distinct.

3. Coordinate space. Now let L denote a set whose elements
consist of all possible ordered n-tuples of real numbers (n is a
fixed natural number). An ordered n-tuple is one in which the
constituents have been numbered. (They need not necessarily be
distinct.) When we say that an element x in L is an n-tuple of
numbers $x_1, x_2, \dots, x_n$, we will write $x = \{x_1, x_2, \dots, x_n\}$.
Assuming x to be arbitrary, let us consider another (also arbitrary)
element $y = \{y_1, y_2, \dots, y_n\}$. We will assume that the elements x
and y are equal if and only if $x_1 = y_1$, $x_2 = y_2$, ..., $x_n = y_n$. We
define linear operations in L by the relations

$$x + y = \{x_1 + y_1,\ x_2 + y_2,\ \dots,\ x_n + y_n\}, \tag{1}$$

$$\alpha x = \{\alpha x_1,\ \alpha x_2,\ \dots,\ \alpha x_n\}. \tag{2}$$

Then the requirements of the first two axioms of a linear space 
hold true since the addition of real numbers is commutative and 
associative. To verify axioms three and four, it suffices to indicate 
a zero element in L, namely, 

$$\theta = \{0, 0, \dots, 0\} \tag{3}$$

It is also clear that for any x in L there is an inverse element $-x$,
namely,

$$-x = \{-x_1, -x_2, \dots, -x_n\} \tag{4}$$

Axiom (5) is immediately apparent from relation (2). Finally,
axioms (6), (7), and (8) hold true because of (1), (2) and also
due to the fact that the multiplication of real numbers is
associative and distributive.

To summarize, then, the set L with specified linear operations is
a real linear space. We will call it a real coordinate space $K_n$.
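
The verification just described can be imitated numerically. The sketch below is an illustration under stated assumptions: the names `add`, `scale` and `check_axioms`, the sample vectors and the sample scalars are all introduced here and are not part of the text. It implements relations (1) and (2) for n-tuples of exact rationals and reports which of the axioms (1)-(8) fail on the samples. Passing such a test on finitely many samples is of course not a proof; the proof is the argument given above.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    """Relation (1): componentwise sum of two n-tuples."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(alpha, x):
    """Relation (2): multiply every component by the scalar alpha."""
    return tuple(alpha * xi for xi in x)

def neg(x):
    """Formula (4): the inverse element."""
    return tuple(-xi for xi in x)

def check_axioms(vectors, scalars, zero):
    """Return the numbers of the axioms that fail on the given samples."""
    failed = set()
    for a, b, c in product(vectors, repeat=3):
        if add(a, b) != add(b, a):
            failed.add(1)
        if add(add(a, b), c) != add(a, add(b, c)):
            failed.add(2)
    for a in vectors:
        if add(a, zero) != a:
            failed.add(3)
        if add(a, neg(a)) != zero:
            failed.add(4)
        if scale(1, a) != a:
            failed.add(5)
    for al, be in product(scalars, repeat=2):
        for a in vectors:
            if scale(al, scale(be, a)) != scale(al * be, a):
                failed.add(6)
            if scale(al + be, a) != add(scale(al, a), scale(be, a)):
                failed.add(7)
    for al in scalars:
        for a, b in product(vectors, repeat=2):
            if scale(al, add(a, b)) != add(scale(al, a), scale(al, b)):
                failed.add(8)
    return sorted(failed)

# Hypothetical samples in the case n = 3.
vectors = [tuple(Fraction(v) for v in t)
           for t in [(1, 2, 3), (0, -1, 4), (5, 0, -2)]]
scalars = [Fraction(0), Fraction(1), Fraction(-2), Fraction(3, 4)]
zero = (Fraction(0),) * 3

print(check_axioms(vectors, scalars, zero))  # []
```

Exact rationals are used instead of floating-point numbers so that the equalities in the axioms can be tested without rounding error.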

Remark. The present set L under consideration does not permit
us to regard the factor $\alpha$ in (2) as a complex number because a
complex $\alpha$ in the right member of (2) would yield a set of complex
numbers that is not an element of L.

4. This time let us denote by L the set of all ordered n-tuples of
complex numbers.

We define the linear operations by (1) and (2), now assuming
that $\alpha$, $x_j$, $y_j$ ($j = 1, \dots, n$) are complex numbers. As in the
preceding subsection, all eight axioms (1)-(8) hold true, and the
zero and inverse elements are expressed by the formulas (3) and
(4). Thus, L is a linear space; it is a complex linear space since
the scalars $\alpha$ are complex numbers. We will call it a complex
coordinate space $K_n$.

Remark. However, there is nothing to stop us, in (1) and (2),
from using only real numbers for $\alpha$ while $x_j$, $y_j$ remain complex.
Then the set L will be a real linear space. It is clear from this that
the same objects (for instance, ordered sets of complex numbers)
can serve as vectors of distinct linear spaces. For this reason, in
the general definition of Section 1, a linear space is not merely a
set L but a set together with linear operations specified in it; and
it is also necessary to indicate the field from which the factors
$\alpha$ are taken.

5. The space of matrices. According to accepted practice, we
will say that a rectangular matrix, more precisely, an $m \times n$
matrix, is an array of numbers arranged in m rows with n numbers
in each row. If the numbers making up the matrix are denoted by
$a_{ik}$ ($i = 1, 2, \dots, m$; $k = 1, 2, \dots, n$) and the matrix
itself by a, then we write

$$a = \begin{Vmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{Vmatrix}$$

In this notation, the given numbers are also arranged in columns
(the number $a_{ik}$ lies in the ith row and the kth column). Besides
this expanded notation we will also make use of an abbreviated
notation:

$$a = \|a_{ik}\|$$

Let us agree to call a matrix a real matrix when it is composed of
real numbers and a complex matrix when the elements (entries)
are complex numbers.

Let L be the set of all $m \times n$ matrices, for example, real
matrices. Two matrices will be considered equal elements of the
set L if and only if corresponding positions in the matrices are
occupied by the same numbers (that is to say, one and the same
number lies in both matrices at the intersection of the ith row
and kth column). Let us equip the set L with linear operations,
namely, if $a = \|a_{ik}\|$, $b = \|b_{ik}\|$ are arbitrary matrices in L and
$\alpha$ is an arbitrary real number, then we set

$$a + b = \|a_{ik} + b_{ik}\|, \qquad \alpha a = \|\alpha a_{ik}\| \tag{5}$$
In other words, when adding the $m \times n$ matrices a and b, we add
pairwise the identically located numbers $a_{ik}$ and $b_{ik}$. To multiply
matrix a by a scalar $\alpha$, we multiply by $\alpha$ all the numbers that
make up matrix a. Just as in Subsection 3, we can establish that
the linear operations (5) satisfy the axioms (1)-(8). Here the role
of the zero element in L is played by the matrix $\theta$, which consists
entirely of zeros (the zero matrix). For the inverse element we
have the matrix $\|-a_{ik}\|$, which is the inverse of $a = \|a_{ik}\|$. Thus,
L together with the linear operations (5) is a real linear space.
Similarly, the set of all complex $m \times n$ matrices with linear
operations (5), where $\alpha$ is a complex number, constitutes a complex
linear space. Naturally, when we consider complex $m \times n$
matrices, we can regard $\alpha$ to be real. Then we get a real linear
space of the same complex matrices.

Remark. In the particular case of $m = 1$ (for a given n), we
obtain matrices each of which has only one row (consisting of n
numbers). The linear space of such matrices is nothing other than
the coordinate space $K_n$ (see Subsection 3). For $n = 1$ and given
m we get matrices with only one column. Clearly, they too
represent a coordinate space, namely $K_m$. What is more, the space of
arbitrary $m \times n$ matrices may be regarded as a coordinate space
$K_{mn}$, since we can readily establish for all elements of the
matrices a general numbering according to some standard system and
then write them out in one row or one column.
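
The identification of $m \times n$ matrices with mn-tuples can be made concrete. In the following sketch, which is an illustration only, the function names and the sample matrices are assumptions; row-by-row numbering is chosen as the "standard system", although any fixed numbering would serve. The checks confirm, for the samples, that the operations (5) on matrices correspond to the operations (1), (2) on the flattened tuples.

```python
def mat_add(a, b):
    """First formula of (5): add identically located entries."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(alpha, a):
    """Second formula of (5): multiply every entry by alpha."""
    return [[alpha * x for x in row] for row in a]

def flatten(a):
    """Number the entries row by row and write them out as one tuple."""
    return tuple(x for row in a for x in row)

def tup_add(x, y):
    """Relation (1) in the coordinate space."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def tup_scale(alpha, x):
    """Relation (2) in the coordinate space."""
    return tuple(alpha * xi for xi in x)

# Hypothetical 2 x 3 real matrices and a scalar.
a = [[1, 2, 3], [4, 5, 6]]
b = [[0, -1, 2], [7, 0, -3]]
alpha = 3

# Flattening commutes with both linear operations.
assert flatten(mat_add(a, b)) == tup_add(flatten(a), flatten(b))
assert flatten(mat_scale(alpha, a)) == tup_scale(alpha, flatten(a))

print(flatten(mat_add(a, b)))  # (1, 1, 5, 11, 5, 3)
```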

6. The space of continuous functions. Let us take, on the real
line, an arbitrary interval $\tau_1 \le \tau \le \tau_2$ and denote by L the
set of all functions that are continuous on this interval and that
assume real values. Bearing in mind that an element x of L is a
certain continuous function $x(\tau)$, $\tau_1 \le \tau \le \tau_2$, we will write
$x = \{x(\tau)\}$. Regarding x as arbitrary, let us consider another, also
arbitrary, element $y = \{y(\tau)\}$. The elements x and y will be
regarded as equal if and only if $x(\tau) = y(\tau)$, that is, when $x(\tau)$
and $y(\tau)$ coincide at every point $\tau$ of the interval
$\tau_1 \le \tau \le \tau_2$. We define linear operations in L by setting

$$x + y = \{x(\tau) + y(\tau)\}, \qquad \alpha x = \{\alpha x(\tau)\} \tag{6}$$

where $\alpha$ is a real number. In other words, we add functions and
multiply them by scalars in the usual way as accepted in analysis.
It is essential to point out that adding continuous functions and
multiplying a continuous function by a constant yield continuous
functions. It is easy to see that the linear operations (6) satisfy
the axioms (1)-(8). Here, the zero element $\theta$ is a function equal to
zero at all points $\tau$ of the interval $[\tau_1, \tau_2]$. The inverse of an
element $x = \{x(\tau)\}$ is $\{-x(\tau)\}$. Thus, the set of all real functions
continuous on $\tau_1 \le \tau \le \tau_2$ together with the linear operations
(6) is a real linear space.

If for L we take the set of all functions continuous on
$\tau_1 \le \tau \le \tau_2$ and having complex values, that is, functions of
the form $x(\tau) = u(\tau) + iv(\tau)$, then in this set we can specify the
linear operations (6) for a complex $\alpha$. All axioms (1)-(8) are again
satisfied and we obtain a complex linear space of continuous
functions with complex values. Here too, as in the examples examined
in Subsections 4 and 5, we can make the set of continuous functions

$$x(\tau) = u(\tau) + iv(\tau)$$

a real linear space if in (6) we admit only real numbers for $\alpha$.
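
The operations (6) translate directly into operations on function objects. The sketch below is an illustration under assumptions: the names `f_add`, `f_scale` and `zero` and the sample functions are introduced here. A program cannot compare two functions at every point of an interval, so the checks are made only at a few sample points and do not replace the argument in the text.

```python
import math

def f_add(x, y):
    """First formula of (6): the sum is the function t -> x(t) + y(t)."""
    return lambda t: x(t) + y(t)

def f_scale(alpha, x):
    """Second formula of (6): the product is the function t -> alpha * x(t)."""
    return lambda t: alpha * x(t)

# The zero element: the function equal to zero at every point.
zero = lambda t: 0.0

# Hypothetical continuous functions on the interval [0, 1].
x = math.sin
y = lambda t: t * t

# The element 2x + y, itself a function on the same interval.
z = f_add(f_scale(2.0, x), y)

for t in (0.0, 0.5, 1.0):
    assert z(t) == 2.0 * math.sin(t) + t * t       # formulas (6) pointwise
    assert f_add(x, zero)(t) == x(t)               # axiom (3) at the sample point
    assert f_add(x, f_scale(-1.0, x))(t) == 0.0    # x + (-1)x vanishes there
```

The result of each operation is again a callable of the same kind, which mirrors the closure of L under the linear operations.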

7. The space of integrable functions.* We consider all
real-valued functions integrable on the interval $\tau_1 \le \tau \le \tau_2$ and
denote the set of these functions by L.

It will be recalled that if we change an integrable function at
one point in any way whatsoever (retaining the remaining values)
then the function will remain integrable, and the integral of the
function will be equal to the same number as prior to the change.
The same goes if the function is changed at several points, even
at an infinitude of points, provided that this set is of measure zero.
From the viewpoint of integration theory, such changes in the
function are not essential. For this reason, in questions of
integration theory it is not desirable to distinguish between two
functions if they coincide on the interval $\tau_1 \le \tau \le \tau_2$ almost
everywhere, that is, at all points of $\tau_1 \le \tau \le \tau_2$ except possibly
for a set of measure zero.

In this connection, we agree to consider as equal two elements
$x = \{x(\tau)\}$, $y = \{y(\tau)\}$ of the set L if $x(\tau) = y(\tau)$ almost
everywhere on the interval $\tau_1 \le \tau \le \tau_2$. Accordingly, an admissible
replacement of an arbitrary element $x = \{x(\tau)\} \in L$ consists in any
change in the values of the function $x(\tau)$ on any set of measure
zero.

It will readily be seen that this definition of equality of elements
of L satisfies the three requirements of Subsection 3, Section 1.
For the first two it is obvious. Now let $y = x$, that is, $y(\tau) = x(\tau)$
everywhere except for a certain set $\mathcal{M}_1$ of measure zero. Let
$z = x$, that is, $z(\tau) = x(\tau)$ everywhere except for a certain set
$\mathcal{M}_2$ of measure zero. Then $y(\tau) = z(\tau)$ everywhere except possibly
for the union of the sets $\mathcal{M}_1$ and $\mathcal{M}_2$. But the union of two sets
of measure zero is a set of measure zero. Consequently,
$y(\tau) = z(\tau)$ almost everywhere and, hence, $y = z$. Thus, the third
requirement is satisfied: if $y = x$, $z = x$, then $y = z$.

* This subsection may be skipped if the reader is not familiar with the
theory of integration.

If in the set L we define the linear operations according to the
formulas (6) of Subsection 6, then invariance of the linear
operations relative to admissible replacements will be ensured and all
axioms (1)-(8) will be seen to hold true. We will not dwell on the
proof of these circumstances and will merely note that in the given
case the zero element is $\theta = \{\theta(\tau)\}$ where $\theta(\tau)$ is any function
equal to zero almost everywhere on the interval $[\tau_1, \tau_2]$.

The set L together with the specified linear operations is termed
the space of functions integrable on the interval $[\tau_1, \tau_2]$.

8. Counterexample. Denote by L the set of all ordered n-tuples
of real numbers ($n > 1$), that is, a set of the same kind as in
Subsection 3. Let us define the sum of two elements of L in the
same way as was done in Subsection 3:

$$x + y = \{x_1 + y_1,\ x_2 + y_2,\ \dots,\ x_n + y_n\} \tag{7}$$

Let multiplication of x by $\alpha$ be given by the rule

$$\alpha x = \{\alpha x_1,\ x_2,\ \dots,\ x_n\} \tag{8}$$

(on the right side, only $x_1$ is multiplied by $\alpha$). The axioms (1)-(4)
are satisfied by (7), and

$$\theta = \{0, 0, \dots, 0\}, \qquad -x = \{-x_1, -x_2, \dots, -x_n\}$$

It is also easy to verify that the requirements of axioms (5), (6),
(8) are satisfied, yet axiom (7) does not hold:

$$(\alpha + \beta)x = \{(\alpha + \beta)x_1,\ x_2,\ \dots,\ x_n\},$$

$$\alpha x + \beta x = \{(\alpha + \beta)x_1,\ 2x_2,\ \dots,\ 2x_n\}$$

Thus, the set L with operations (7), (8) is not a linear space.
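
The failure of axiom (7) is easy to exhibit with concrete numbers. In the sketch below the names `add` and `bad_scale` and the sample values are assumptions chosen for illustration; `bad_scale` implements rule (8) of this subsection.

```python
def add(x, y):
    """Rule (7): componentwise sum of two n-tuples."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def bad_scale(alpha, x):
    """Rule (8): only the first component is multiplied by alpha."""
    return (alpha * x[0],) + tuple(x[1:])

# Hypothetical sample with n = 3.
x = (1, 1, 1)
y = (2, 0, 5)
alpha, beta = 2, 3

lhs = bad_scale(alpha + beta, x)                     # (alpha + beta) x
rhs = add(bad_scale(alpha, x), bad_scale(beta, x))   # alpha x + beta x

print(lhs)  # (5, 1, 1)
print(rhs)  # (5, 2, 2)

# Axiom (8) of the linear space, by contrast, holds for this sample.
assert bad_scale(alpha, add(x, y)) == add(bad_scale(alpha, x), bad_scale(alpha, y))
```

The two results differ in every component after the first, exactly as the formulas for $(\alpha + \beta)x$ and $\alpha x + \beta x$ predict.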

§ 3. Elementary corollaries to the axioms of a linear space 

1. Let us now examine the general theory, that is, the
conclusions that follow from axioms (1)-(8) irrespective of the
particularities of specific linear spaces. The following propositions
hold true.

(1) In every linear space there is only one zero vector.

Proof. Suppose the elements $\theta_1$ and $\theta_2$ are zero elements. By
axioms (1) and (3) they coincide:

$$\theta_2 = \theta_2 + \theta_1 = \theta_1 + \theta_2 = \theta_1$$

Remark. When we say that there is only one zero vector, we
mean that we do not distinguish between equal vectors.
Uniqueness is to be understood in the same way in the other theorems
as well, for instance, in the following proposition.

(2) For any vector x there is only one opposite vector.

Proof. Suppose that $x + y_1 = \theta$ and that $x + y_2 = \theta$. Axioms
(1)-(4) permit writing down the following chain of equations:

$$y_2 = y_2 + \theta = y_2 + (x + y_1) = (y_2 + x) + y_1 = (x + y_2) + y_1 = \theta + y_1 = y_1$$

which means $y_2 = y_1$.

(3) The product of any vector x by the number 0 is equal to the
zero vector $\theta$.

Proof. For a given vector x take the opposite vector y. Using
axioms (2)-(5) and (7), we get

$$0 \cdot x = 0 \cdot x + \theta = 0 \cdot x + (x + y) = (0 + 1)x + y = x + y = \theta$$

(4) The product of any vector x by the number $-1$ is equal to
the vector opposite to x, that is, $(-1)x = -x$.

Proof. It is required to establish that $x + (-1)x = \theta$. From the
preceding property and from axioms (5) and (7) we have

$$x + (-1)x = (1 - 1)x = 0 \cdot x = \theta$$

(5) The product of the zero vector $\theta$ by any scalar $\alpha$ is equal to
the zero vector.

Proof. Take an arbitrary vector x. Using axiom (6) and
property (3), we find

$$\alpha\theta = \alpha(0 \cdot x) = (\alpha \cdot 0)x = 0 \cdot x = \theta$$


2. Remarks. (1) From property (5) it follows that the product
of a nonzero vector by a nonzero scalar always yields a nonzero
vector. Indeed, if for $\lambda \ne 0$, $a \ne \theta$ it were true that
$\lambda a = \theta$, then because of property (5) and axioms (5) and (6), we
would have

$$a = 1 \cdot a = \left(\frac{1}{\lambda}\lambda\right)a = \frac{1}{\lambda}(\lambda a) = \frac{1}{\lambda}\theta = \theta$$

which contradicts the condition $a \ne \theta$.

(2) If $\alpha \ne \beta$ and $a \ne \theta$, then $\alpha a \ne \beta a$. Indeed, if it
turned out that $\alpha a = \beta a$ then it would be true that
$\alpha a + (-\beta)a = \theta$, or $(\alpha - \beta)a = \theta$, which runs counter to the
foregoing, since $\alpha - \beta \ne 0$ and $a \ne \theta$.

3. The operation of subtraction is defined in a linear space.
Namely, a vector x is called the difference between a vector b and a
vector a if $x + a = b$, and we write $x = b - a$.

We will prove that for arbitrary elements a and b a difference
exists and it is unique.

Existence. We prove that the vector $x = b + (-1)a$ is the
difference $b - a$. By axioms (2), (3), (5), (7) and property (3) we
have

$$x + a = b + (-1)a + a = b + (-1 + 1)a = b + 0 \cdot a = b$$

Uniqueness. We show that if x is the difference $b - a$, then it
can always be represented in the form $x = b + (-1)a$. Indeed,
from the equation $x + a = b$ we get, with the aid of axioms (2),
(3), (5), (7) and property (3),

$$x = x + \theta = x + (1 - 1)a = x + a + (-1)a = b + (-1)a$$

4. In the sequel, we will make use of the axioms of linear space 
and of the properties established in this section without detailed 
explanations. Due to the axioms and the results obtained here, 
computations involving elements of a linear space are carried out 
in a manner similar to the manipulations of elementary algebra, 
with the sole difference that there is no multiplication and division 
of vectors and one must distinguish between the number zero and 
the zero vector. 

In particular, we can transpose a vector from one member of 
a vector equation to the other by multiplying that vector by minus 
one (or, what is the same thing, by replacing it by the opposite 
vector). 


§ 4. Linear combinations. Linear dependence 

1. Given a finite number of elements of a linear space: 
$a, b, c, \dots, q$. Also, let $\alpha, \beta, \gamma, \dots, \kappa$ be arbitrary scalars.

Definition 1. Any element $x$ of the space $L$ that can be represented as

$$x = \alpha a + \beta b + \gamma c + \dots + \kappa q$$

is called a linear combination of the elements $a, b, c, \dots, q$. We
also say that $x$ is expressed linearly in terms of $a, b, c, \dots, q$.

Definition 2. A linear combination is termed trivial if
$\alpha = \beta = \gamma = \dots = \kappa = 0$ and nontrivial if there is at least one
nonzero scalar among the scalars $\alpha, \beta, \dots, \kappa$.

Definition 3. A system (set) of vectors $a, b, c, \dots, q$ is said to
be linearly dependent if there is a nontrivial linear combination of
vectors $a, b, c, \dots, q$ equal to the zero vector, in other words, if
it is true that

$$\alpha a + \beta b + \gamma c + \dots + \kappa q = \theta$$

where there is at least one nonzero scalar among the scalars
$\alpha, \beta, \gamma, \dots, \kappa$.

Definition 4. A system of vectors $a, b, c, \dots, q$ is said to be
linearly independent if the equation

$$\alpha a + \beta b + \gamma c + \dots + \kappa q = \theta$$

is possible only if

$$\alpha = \beta = \gamma = \dots = \kappa = 0$$
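
The four definitions above can be tried out on concrete vectors. The following is a minimal sketch in Python, assuming vectors are given as tuples of components in a coordinate space; the function names `linear_combination` and `is_trivial` are illustrative and do not come from the text.

```python
from fractions import Fraction

def linear_combination(scalars, vectors):
    """Return scalars[0]*vectors[0] + scalars[1]*vectors[1] + ... as a tuple."""
    n = len(vectors[0])
    return tuple(
        sum(Fraction(s) * v[i] for s, v in zip(scalars, vectors))
        for i in range(n)
    )

def is_trivial(scalars):
    """A linear combination is trivial when every scalar is zero."""
    return all(s == 0 for s in scalars)

# Hypothetical example: b = 2a, so 2*a + (-1)*b is a nontrivial
# linear combination equal to the zero vector.
a, b = (1, 2, 3), (2, 4, 6)
scalars = (2, -1)
print(linear_combination(scalars, (a, b)) == (0, 0, 0))  # True
print(is_trivial(scalars))                                # False
```

Since a nontrivial combination of a and b yields the zero vector, this particular pair is linearly dependent in the sense of Definition 3.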

2. Let us consider the properties of the foregoing concepts. 

(1) It follows directly from the definitions that any finite system
of vectors is either linearly dependent or linearly independent. We
will show that a system consisting of one vector is linearly
dependent if and only if this vector is a zero vector.

Indeed, the equation $\alpha\theta = \theta$ for any $\alpha$, in particular for $\alpha \neq 0$,
was established in Subsection 1 of Section 3. Now let $x \neq \theta$ and
$\alpha x = \theta$. Then $\alpha = 0$ in accordance with Subsection 2 of Section 3.

(2) If part of a system is linearly dependent, then the whole 
system is linearly dependent. 

Suppose it is known that a part of the system $a, b, c, \dots, q$,
consisting of the vectors $c, \dots, q$, is linearly dependent. This
means that there exist scalars $\gamma, \dots, \kappa$, not all equal to zero and
such that $\gamma c + \dots + \kappa q = \theta$. But then the linear combination
$0 \cdot a + 0 \cdot b + \gamma c + \dots + \kappa q = \theta$ is nontrivial since there are
nonzero scalars among $\gamma, \dots, \kappa$.

(3) If an entire system of vectors is linearly independent, then 
so also is any part of that system. 

This follows directly from the preceding property. In particular, 
the zero vector cannot enter into a linearly independent system. 

(4) If a system is linearly dependent, then there will be at least
one vector in it that will be expressed linearly in terms of the
remaining vectors.

Indeed, if $\alpha a + \beta b + \gamma c + \dots + \kappa q = \theta$ and there are nonzero
coefficients among $\alpha, \beta, \dots, \kappa$, then any one of the vectors having
nonzero coefficients may be expressed linearly in terms of the
remaining vectors of the system. For instance, if $\alpha \neq 0$, then

$$a = \left(-\frac{\beta}{\alpha}\right) b + \left(-\frac{\gamma}{\alpha}\right) c + \dots + \left(-\frac{\kappa}{\alpha}\right) q$$

By property (4) this condition is necessary for the linear dependence
of the system of vectors; it is also sufficient, namely, the
following assertion holds true.

(5) If some element of a system is a linear combination of the
remaining elements, then the system is linearly dependent.

Indeed, if

$$a = \beta' b + \gamma' c + \dots + \kappa' q$$

then

$$1 \cdot a + (-\beta') b + (-\gamma') c + \dots + (-\kappa') q = \theta$$

and the linear combination in the left member of the last
equation is nontrivial, since the coefficient of $a$ is 1.

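(6) Let $a_1, \dots, a_k$ be certain vectors. Suppose each one of the
vectors $c_1, c_2, \dots, c_n$ is expressed linearly in terms of $a_1, \dots, a_k$:

$$\begin{aligned}
c_1 &= \alpha_{11} a_1 + \dots + \alpha_{1k} a_k, \\
c_2 &= \alpha_{21} a_1 + \dots + \alpha_{2k} a_k, \\
&\;\;\vdots \\
c_n &= \alpha_{n1} a_1 + \dots + \alpha_{nk} a_k
\end{aligned}$$

Furthermore, let vector $b$ be linearly expressed in terms of
$a_1, \dots, a_k, c_1, \dots, c_n$:

$$b = \lambda_1 a_1 + \dots + \lambda_k a_k + \mu_1 c_1 + \dots + \mu_n c_n$$

Then vector $b$ can be linearly expressed in terms of the vectors
$a_1, \dots, a_k$.

Proof.

$$b = (\lambda_1 + \mu_1 \alpha_{11} + \dots + \mu_n \alpha_{n1})\, a_1 + \dots + (\lambda_k + \mu_1 \alpha_{1k} + \dots + \mu_n \alpha_{nk})\, a_k$$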
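
The substitution carried out in the proof of property (6) is a purely mechanical computation on coefficients. A minimal Python sketch follows, assuming the coefficients are stored in plain lists; the function name `eliminate_intermediate` and the sample numbers are hypothetical.

```python
from fractions import Fraction

def eliminate_intermediate(alpha, lam, mu):
    """Coefficients of b in terms of a_1..a_k alone.

    alpha[i][j] is the coefficient of a_j in c_i,
    lam[j] is the coefficient of a_j in b,
    mu[i] is the coefficient of c_i in b.
    """
    k = len(lam)
    return [
        Fraction(lam[j]) + sum(Fraction(mu[i]) * alpha[i][j] for i in range(len(mu)))
        for j in range(k)
    ]

# Hypothetical data: c_1 = a_1 + 2 a_2, c_2 = a_2, b = a_1 + a_2 + 2 c_1 + 3 c_2.
alpha = [[1, 2], [0, 1]]
print(eliminate_intermediate(alpha, lam=[1, 1], mu=[2, 3]))
```

With $a_1 = \{1, 0\}$ and $a_2 = \{0, 1\}$ one can check by hand that $b = \{3, 8\}$, which agrees with the coefficients 3 and 8 returned by the sketch.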


§ 5. Lemma on the basis minor 

1. Suppose we have a rectangular matrix $A = \|a_{ik}\|$ with $m$ rows and $n$ columns. We will
regard the rows of the matrix as vectors of the coordinate space 
Kn, and the columns as vectors of the coordinate space Km (see 
Section 2, Subsections 3 to 5). Then we can speak about the linear 
dependence or independence of the rows of the matrix or about the 
linear dependence or independence of the columns. 

2. Let us mark $k$ distinct rows and $k$ distinct columns of the
matrix $A$ ($k \leq n$, $k \leq m$). The elements * of matrix $A$ lying at
the intersections of the marked rows and columns form a certain
(clearly, square) matrix $B$. The determinant of matrix $B$ is called
a minor of order $k$ of the given matrix $A$.

Now mark, if possible, one more row and one more column of A 
without repeating the ones already marked. Now, all marked rows 
and columns intersect to form a certain square matrix C. 

The determinant of the matrix $C$ is a minor of order $k + 1$ of the
matrix $A$. With respect to the original minor (that is, the
determinant of matrix $B$), it is a bordering minor.


* The elements of a matrix are the numbers that compose it: $a_{11}, a_{12}, \dots$
However, it would be more exact to say that the elements of a matrix are the
symbols $a_{11}, a_{12}, \dots$ Here, two elements $a_{ik}$ and $a_{jl}$ are taken to be distinct
if $i \neq j$ or $k \neq l$ (without precluding the possibility that $a_{ik}$ and $a_{jl}$ denote
the same number). Also note that in a number of cases, matrices are considered
in which $a_{ik}$ are not numbers but other entities, for example, functions.

Remarks. (1) If $k = n$ or $k = m$, then there are no bordering
minors for minors of order $k$.

(2) If $k = 1$, the matrix $B$ consists of a single element of
matrix $A$. The minors of the first order are the numerical values
of the elements of the matrix.

3. Definition 1. A minor of a matrix is called a basis minor if 
it is not equal to zero and the bordering minors are either equal 
to zero or are absent altogether. 

Definition 2. The columns of the matrix intersecting the basis 
minor are called basis columns. The terminology is similar for 
rows, in which case we have basis rows. 

Remark. A matrix can have several basis minors and, accordingly,
several systems of basis columns. Every matrix, except
the zero matrix, has at least one basis minor and thus at least one
system of basis columns.
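
The notions of minor, bordering minor and basis minor can be checked directly on a small matrix. The following Python sketch assumes the matrix is a list of rows with integer or rational entries and uses 0-based row and column indices; the helper names (`det`, `minor`, `bordering_minors`, `is_basis_minor`) are illustrative only.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 0:
        return Fraction(1)
    return sum(
        ((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
         for j in range(len(m))),
        Fraction(0),
    )

def minor(a, rows, cols):
    """Minor of a formed by the marked rows and columns."""
    return det([[a[i][j] for j in cols] for i in rows])

def bordering_minors(a, rows, cols):
    """All minors obtained by marking one more row and one more column."""
    result = []
    for i in range(len(a)):
        for j in range(len(a[0])):
            if i not in rows and j not in cols:
                result.append(minor(a, sorted(set(rows) | {i}), sorted(set(cols) | {j})))
    return result

def is_basis_minor(a, rows, cols):
    """Nonzero minor whose bordering minors are all zero (or absent)."""
    return minor(a, rows, cols) != 0 and all(b == 0 for b in bordering_minors(a, rows, cols))

# Hypothetical matrix: the second row is twice the first.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
print(is_basis_minor(A, (0, 2), (0, 1)))  # True
print(is_basis_minor(A, (0,), (0,)))      # False: it has a nonzero bordering minor
```

In this sample matrix the first and second columns are therefore basis columns, and the first and third rows are basis rows.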

4. Lemma on the basis minor. The columns of a matrix that
intersect the basis minor are linearly independent. Every column can
be linearly expressed in terms of them.

By definition 2, this lemma can be expressed thus: 

The basis columns are linearly independent. Any column of a 
matrix can be expressed linearly in terms of the basis columns. 

Proof. The proof of the first assertion is by reductio ad absurdum.
Suppose that the basis columns are linearly dependent. Then
the columns of the basis minor are also linearly dependent, but
then the basis minor is equal to zero, which runs counter to the
definition.

Proof of the second assertion. For the sake of definiteness, we
assume that the basis minor is of order $r$ and occupies the upper
left-hand corner of the matrix:

$$A = \begin{Vmatrix}
a_{11} & \dots & a_{1r} & \dots & a_{1k} & \dots & a_{1n} \\
\vdots & & \vdots & & \vdots & & \vdots \\
a_{r1} & \dots & a_{rr} & \dots & a_{rk} & \dots & a_{rn} \\
\vdots & & \vdots & & \vdots & & \vdots \\
a_{m1} & \dots & a_{mr} & \dots & a_{mk} & \dots & a_{mn}
\end{Vmatrix}$$

Denote this basis minor by $D$.

Take arbitrary indices $i$, $k$ ($1 \leq i \leq m$, $1 \leq k \leq n$) and form
a determinant of order $r + 1$:

$$\Delta_{ik} = \begin{vmatrix}
a_{11} & \dots & a_{1r} & a_{1k} \\
\vdots & & \vdots & \vdots \\
a_{r1} & \dots & a_{rr} & a_{rk} \\
a_{i1} & \dots & a_{ir} & a_{ik}
\end{vmatrix}$$

We will prove that $\Delta_{ik} = 0$. Let us consider three possible cases:

(1) $i \leq r$. In this case, $\Delta_{ik} = 0$ since here the last row
coincides with one of the preceding rows.

(2) $k \leq r$. In this case, $\Delta_{ik} = 0$ since here the last column
coincides with one of the preceding ones.

(3) $i > r$, $k > r$. In this case, the determinant $\Delta_{ik}$ is a
bordering determinant with respect to the minor $D$ and is equal to zero
because $D$ is a basis minor.

Fix $k$ and assume that $i$ runs over all possible values from
1 to $m$.

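Let us expand $\Delta_{ik}$ by the elements of the last row. Denote the
cofactors of the elements of the last row by $A_1, A_2, \dots, A_{r+1}$. As $i$
varies, these quantities remain unchanged since the cofactor of an
element depends only on the position it occupies in the determinant
but does not depend on the numerical value of the element
itself. The expansion yields

$$\Delta_{ik} = A_1 a_{i1} + \dots + A_r a_{ir} + A_{r+1} a_{ik} = 0 \qquad (1)$$

here

$$A_{r+1} = D \neq 0 \qquad (2)$$

The relations (1) and (2) yield

$$a_{ik} = \left(-\frac{A_1}{D}\right) a_{i1} + \dots + \left(-\frac{A_r}{D}\right) a_{ir}$$

Recall that $k$ is fixed and $i$ ranges over all values from 1 to $m$,
therefore

$$\begin{pmatrix} a_{1k} \\ \vdots \\ a_{mk} \end{pmatrix}
= \left(-\frac{A_1}{D}\right) \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}
+ \dots +
\left(-\frac{A_r}{D}\right) \begin{pmatrix} a_{1r} \\ \vdots \\ a_{mr} \end{pmatrix} \qquad (3)$$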


Formula (3) represents the $k$th column (which may be taken
arbitrarily) of the matrix in the form of a linear combination of 
the basis columns. This completes the proof of the lemma. 
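
The coefficients $-A_j/D$ that appear in formula (3) can be computed explicitly. The sketch below is a Python illustration under the same assumption as in the proof, namely that the basis minor of order r occupies the upper left-hand corner; indices are 0-based, and the names `det` and `column_coefficients` are hypothetical.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 0:
        return Fraction(1)
    return sum(
        ((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
         for j in range(len(m))),
        Fraction(0),
    )

def column_coefficients(a, r, k):
    """Coefficients c_0..c_{r-1} such that column k = sum of c_j * column j.

    Assumes the basis minor is the upper-left r x r block of a.
    """
    # First r rows of the bordered determinant: basis columns plus column k.
    top = [list(row[:r]) + [row[k]] for row in a[:r]]
    d = det([list(row[:r]) for row in a[:r]])  # the basis minor D
    coeffs = []
    for j in range(r):
        # Cofactor of the j-th element of the last row: delete that row and column j.
        sub = [row[:j] + row[j + 1:] for row in top]
        cofactor = (-1) ** (r + j) * det(sub)
        coeffs.append(-cofactor / d)
    return coeffs

# Hypothetical matrix of rank 2 whose third column is 2*(first) + 3*(second).
A = [[1, 0, 2],
     [0, 1, 3],
     [1, 1, 5]]
print(column_coefficients(A, r=2, k=2))
```

For this matrix the sketch returns the coefficients 2 and 3, so the third column is recovered as a linear combination of the two basis columns, as the lemma asserts.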

Remark. A similar lemma naturally holds true for basis rows as 
well. 

5. As a corollary to the lemma on the basis minor, we have the 
following theorem. 

Theorem. The determinant of a square matrix is equal to zero 
if and only if there is a linear dependence between the columns of 
the matrix. A similar assertion holds true for rows. 

Proof. If the columns of an $n \times n$ matrix are dependent, then its
determinant is equal to zero. This is one of the basic properties of
determinants. We will show that if the columns are independent,
then the determinant is not equal to zero. Indeed, if the columns
are independent and the determinant is equal to 0, then there must
be a basis minor $M$ of order less than $n$. But then there is a
column that does not enter into the system of basis columns (the
system corresponding to the minor $M$) and that can be linearly
expressed in terms of the system, that is, there is a dependence
between the columns. But this contradicts the assumption.

§ 6. Basic lemma on two systems of vectors 

1. Let there be given two systems of vectors $a_1, a_2, \dots, a_k$ and
$b_1, b_2, \dots, b_m$ in one and the same linear space.

Lemma 1 (basic). If the system $b_1, \dots, b_m$ is linearly
independent, and each of the vectors $b_i$ is linearly expressible in terms
of the system $a_1, a_2, \dots, a_k$, then $m \leq k$.

Proof (by contradiction). Suppose that $m > k$. Write down the
formulas that express the vectors $b_i$ in terms of the vectors $a_j$:

$$\begin{aligned}
b_1 &= \alpha_{11} a_1 + \dots + \alpha_{1k} a_k, \\
b_2 &= \alpha_{21} a_1 + \dots + \alpha_{2k} a_k, \\
&\;\;\vdots \\
b_m &= \alpha_{m1} a_1 + \dots + \alpha_{mk} a_k
\end{aligned} \qquad (1)$$

and consider the matrix $\|\alpha_{ij}\|$. If the matrix $\|\alpha_{ij}\|$ is a zero matrix,
then $b_1 = \dots = b_m = \theta$ and the system $b_1, \dots, b_m$ is linearly
dependent, which contradicts the hypothesis. Suppose the matrix
$\|\alpha_{ij}\|$ is nonzero. Then it has a basis minor and the order of the
basis minor does not exceed the number of columns $k$. The number
of rows in the matrix $\|\alpha_{ij}\|$ is $m$ and is greater than $k$ and,
consequently, is greater than the number of basis rows in the matrix.

Thus, the matrix $\|\alpha_{ij}\|$ has a certain system of basis rows and
also at least one row that does not enter into the system. According
to the basis-minor lemma, the indicated row can be linearly
expressed in terms of the basis rows. But then this means that
there is a linear dependence between the rows of the matrix (see
Section 4, Subsection 2, Items (5) and (2)). We write it in the
form

$$\lambda_1 \{\alpha_{11}, \dots, \alpha_{1k}\} + \dots + \lambda_m \{\alpha_{m1}, \dots, \alpha_{mk}\} = \{0, \dots, 0\} \qquad (2)$$

where there are nonzero scalars among $\lambda_1, \dots, \lambda_m$.
Multiply equations (1) by $\lambda_1, \dots, \lambda_m$ respectively and add them
termwise. Taking into account the linear dependence (2), we find

$$\lambda_1 b_1 + \dots + \lambda_m b_m = a_1 (\lambda_1 \alpha_{11} + \dots + \lambda_m \alpha_{m1}) + \dots + a_k (\lambda_1 \alpha_{1k} + \dots + \lambda_m \alpha_{mk}) = 0 \cdot a_1 + \dots + 0 \cdot a_k = \theta$$

The system $b_1, \dots, b_m$ proved to be linearly dependent, but this
is impossible by the hypothesis of the lemma. The resulting
contradiction completes the proof of Lemma 1.

2. We say that the vectors $a_{i_1}, \dots, a_{i_r}$ form a linearly
independent subsystem in the system $a_1, \dots, a_k$ ($k \geq r$) if the vectors
$a_{i_1}, \dots, a_{i_r}$ are linearly independent and enter into the system
$a_1, \dots, a_k$.

Clearly, the system $a_1, \dots, a_k$ contains at least one linearly
independent subsystem if and only if there is at least one nonzero
vector among the vectors $a_1, \dots, a_k$.

3. Lemma 2. Suppose the system of vectors $a_1, \dots, a_r, a_{r+1}$ is
linearly dependent and its subsystem $a_1, \dots, a_r$ is linearly
independent. Then the vector $a_{r+1}$ can be expressed linearly in terms of the
vectors $a_1, \dots, a_r$.

Proof. We have the dependence

$$\lambda_1 a_1 + \dots + \lambda_r a_r + \lambda_{r+1} a_{r+1} = \theta \qquad (3)$$

where among the scalars $\lambda_1, \dots, \lambda_r, \lambda_{r+1}$ there are some different
from zero. It is clear that $\lambda_{r+1}$ cannot be equal to zero since in that
case the subsystem $a_1, \dots, a_r$ would be a dependent subsystem.
Thus, $\lambda_{r+1} \neq 0$, and from (3) we get

$$a_{r+1} = \left(-\frac{\lambda_1}{\lambda_{r+1}}\right) a_1 + \dots + \left(-\frac{\lambda_r}{\lambda_{r+1}}\right) a_r$$

which is what we set out to prove.

4. Definition. Suppose a system $a_1, \dots, a_k$ contains a linearly
independent subsystem consisting of $r$ vectors. The number $r$ is
termed the rank of the system $a_1, \dots, a_k$ if any subsystem of a
larger number of vectors is linearly dependent, or if there are no
such subsystems (when $r = k$).

Briefly, the rank of a system is the maximum number of its
linearly independent vectors.

If all vectors of the system $a_1, \dots, a_k$ are zero vectors, we say
the rank of the system is zero.

5. Lemma 3 (generalized basic). Suppose each of the vectors
$b_1, \dots, b_m$ is expressed linearly in terms of the vectors $a_1, \dots, a_k$.
Then the rank of the system $b_1, \dots, b_m$ does not exceed the rank
of the system $a_1, \dots, a_k$.

Proof. Denote by $r$ the rank of the system $a_1, \dots, a_k$. If $r = 0$,
then the truth of the assertion of Lemma 3 is obvious. If $r = k$,
the truth of the assertion of Lemma 3 follows from Lemma 1. Indeed,
the rank of the system $b_1, \dots, b_m$, by Lemma 1, does not
exceed $k = r$.

Let $0 < r < k$. Then the system $a_1, \dots, a_k$ will have $r$ linearly
independent vectors. Suppose they are the vectors $a_1, \dots, a_r$. By
adjoining one more vector from the system $a_1, \dots, a_k$, we will each
time obtain linearly dependent systems, namely,

$$\begin{aligned}
&a_1, \dots, a_r, a_{r+1}; \\
&a_1, \dots, a_r, a_{r+2}; \\
&\;\;\vdots \\
&a_1, \dots, a_r, a_k
\end{aligned}$$

By Lemma 2, each of the vectors $a_{r+1}, \dots, a_k$ can be linearly
expressed in terms of the vectors $a_1, \dots, a_r$. On the other hand,
the lemma states that each vector $b_1, \dots, b_m$ is expressed linearly
in terms of all the vectors $a_1, \dots, a_k$. From this fact and from the
preceding conclusion it follows that each vector $b_1, \dots, b_m$ is
linearly expressible in terms of $a_1, \dots, a_r$ (see Section 4,
Subsection 2, Item (6)). But then, by Lemma 1, the number of vectors in
any linearly independent subsystem of the system $b_1, \dots, b_m$ does
not exceed $r$. The proof of Lemma 3 is complete.


§ 7. The rank of a matrix 

1. Definition. The rank of a matrix is the maximum number of 
its linearly independent columns. 

In other words, the rank of a matrix is the rank of the system 
of its columns regarded as vectors of a coordinate space. 

For the rank of a matrix $A$ we will use the symbol "rank $A$".

If matrix $A$ is a zero matrix, then rank $A = 0$ since a zero
matrix has no linearly independent columns. Note that the rank of
a nonzero matrix is always positive.

2. Theorem on the rank of a matrix. The rank of an arbitrary 
matrix is equal to the maximum order of its nonzero minors. 

Proof. If rank $A = 0$, then $A$ is a zero matrix and it does not
have any nonzero minors. In this case, it is natural to consider
that the maximum order of the nonzero minors is equal to
zero.

Now suppose the matrix $A$ is nonzero. If one of its minors $M$ of
order $r$ is not equal to zero, and all higher-order minors are zero
or absent, then $M$ is a basis minor. By the basis-minor lemma,
the columns of matrix $A$ intersecting minor $M$ are linearly
independent. Therefore, rank $A \geq r$. By the same lemma, any column
of matrix $A$ can be expressed linearly in terms of the basis
columns, whence, using Lemma 3 of Section 6, we find rank $A \leq r$.
And so rank $A = r$, thus completing the proof.
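
The theorem suggests a direct, if inefficient, way of finding the rank: search for a nonzero minor of the largest possible order. The following Python sketch does exactly that by brute force; it is meant only for small matrices, and the names `det` and `rank_by_minors` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 0:
        return Fraction(1)
    return sum(
        ((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
         for j in range(len(m))),
        Fraction(0),
    )

def rank_by_minors(a):
    """Maximum order of a nonzero minor of a (0 for a zero matrix)."""
    m, n = len(a), len(a[0])
    for k in range(min(m, n), 0, -1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if det([[a[i][j] for j in cols] for i in rows]) != 0:
                    return k
    return 0

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
print(rank_by_minors(A))  # 2
```

The number of minors grows very quickly with the size of the matrix, so this approach serves as an illustration of the theorem rather than as a practical method.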

3. A number of important corollaries follow from the reasoning 
carried out in the preceding subsection. 

(1) The rank of a nonzero matrix is equal to the order of any 
one of its basis minors. 

Indeed, if M is an arbitrary basis minor and r is its order, then, 
repeating the foregoing arguments, we see that rank A = r. 

(2) All basis minors of a nonzero matrix have the same order, 
which is equal to the rank of the matrix. 

(3) If in matrix A the minor M is a basis minor, then all minors 
of higher order (and not only the minors bordering M) are equal 
to zero. 

(4) The maximum number of linearly independent rows of 
an arbitrary matrix A is equal to the maximum number of 
its linearly independent columns (that is, it is equal to the 
rank of A). 

Proof. If A is a zero matrix, the number of linearly independent 
rows and the number of linearly independent columns is zero. 
Let A be a nonzero matrix. Take the transpose of A. The rows will 
then become the columns of the transposed matrix A*, the linearly 
independent rows become the linearly independent columns of A*, 
and the maximum order of nonzero minors is preserved because,
in a transposition, each of the minors preserves its numerical
value. Thus

rank A = rank A* 

and is equal to the maximum number of linearly independent rows 
of matrix A. 

(5) If $A$ is an arbitrary $m \times n$ matrix, then rank $A$ does not
exceed the smaller of the two numbers $m$ and $n$.

4. It is clear from the foregoing that the rank of a matrix does 
not change in an interchange of its columns or rows. 

Besides, from the lemmas of Section 6 it follows that the rank
of a matrix does not change if to one of the columns we add a
linear combination of the other columns.

Similarly, the rank is preserved if to one of the rows we add a 
linear combination of the other rows. 

The properties enumerated in this subsection are ordinarily used
for computing the rank of a matrix. Namely, a given matrix is
transformed so that the rank remains unchanged but the matrix
is changed to one in which the basis minor is immediately
apparent.
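
The procedure just described can be sketched in Python. The version below is one possible arrangement, not necessarily the one the authors have in mind: it uses only interchanges of rows and the addition to a row of a multiple of another row, both of which preserve the rank, and it brings the matrix to a staircase form in which the basis minor is visible. Exact rational arithmetic is assumed, and the name `rank_by_elimination` is hypothetical.

```python
from fractions import Fraction

def rank_by_elimination(a):
    """Rank of a, computed with rank-preserving row operations only."""
    rows = [[Fraction(x) for x in row] for row in a]
    m, n = len(rows), len(rows[0])
    rank = 0
    for col in range(n):
        # Look for a row, not yet used, with a nonzero element in this column.
        pivot = next((i for i in range(rank, m) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]  # interchange of rows
        for i in range(rank + 1, m):
            factor = rows[i][col] / rows[rank][col]
            # Add to row i a multiple of the pivot row.
            rows[i] = [x - factor * p for x, p in zip(rows[i], rows[rank])]
        rank += 1
        if rank == m:
            break
    return rank

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
transpose = [list(col) for col in zip(*A)]
print(rank_by_elimination(A), rank_by_elimination(transpose))  # 2 2
```

The equality of the two printed values illustrates corollary (4) of Subsection 3: the rank of a matrix and the rank of its transpose coincide.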
§ 8. Finite-dimensional and infinite-dimensional spaces. Bases

1. Definition 1. A linear space is said to be n-dimensional if it
has a linearly independent system consisting of n vectors, and any
system consisting of a larger number of vectors is linearly
dependent.

The number n is called the dimension of the linear space. Thus,
the dimension of a space is the largest number of its linearly
independent vectors.

For example, the space of geometric vectors (see Section 2,
Subsection 1) is three-dimensional since it has three independent
vectors, and any four vectors are related by a linear dependence.
Geometric vectors located in one plane form a two-dimensional
space, in which any two noncollinear vectors are linearly
independent and any three vectors are linearly dependent. Vectors lying
on a single straight line form a one-dimensional space. A linear
space containing the zero vector $\theta$ as its sole element is called
a zero-dimensional linear space.

2. All n-dimensional spaces (n = 0, 1, 2, 3, ...) form the class 
of finite-dimensional spaces. But this does not exhaust the set of 
all linear spaces. 

Definition 2. A linear space is said to be infinite-dimensional 
if for any integer N > 0 there exists in it a linearly independent 
system consisting of N vectors. 

Example. The linear space of functions continuous on a given
interval (see Section 2, Subsection 6) is an infinite-dimensional
space. To see that this is so, it suffices to consider the power
functions $1, \tau, \tau^2, \dots, \tau^N$.

It is easy to establish their linear independence. Any linear
combination of them is a polynomial of degree not higher than $N$:

$$\alpha_0 + \alpha_1 \tau + \dots + \alpha_N \tau^N = p(\tau)$$

But a polynomial with at least one nonzero coefficient has only a finite
number of roots, therefore $p(\tau) \equiv 0$, i.e., $\{p(\tau)\} = \theta$, if and only if

$$\alpha_0 = \alpha_1 = \alpha_2 = \dots = \alpha_N = 0$$

It has thus been demonstrated that the elements under consideration
are independent and the space is infinite-dimensional since
the number $N$ may be arbitrarily great.

3. We now introduce a definition that will be very important for
what follows.

Definition 3. A system of vectors $e_1, \dots, e_n$ in a space $L$ is
called a basis if:

(1) the vectors $e_1, \dots, e_n$ are linearly independent;

(2) any vector $x$ in $L$ can be expressed linearly in terms of
$e_1, \dots, e_n$, that is,

$$x = x_1 e_1 + \dots + x_n e_n \qquad (1)$$

An equation like (1) is called an expansion of the vector $x$ in
terms of the basis $e_1, \dots, e_n$; the numerical coefficients $x_1, \dots, x_n$
are termed the coordinates (or components) of the vector $x$
relative to that basis.

4. Theorem. A linear space is n-dimensional if and
only if it has a basis consisting of n vectors.

Proof. (1) Let a space $L$ be n-dimensional. This means that it
has a linearly independent system of n vectors $e_1, \dots, e_n$, and that
if we add to it an arbitrary vector $x$ of $L$, we get the linearly
dependent system $e_1, \dots, e_n, x$. By Lemma 2, Section 6, the vector $x$
is linearly expressible in terms of the vectors $e_1, \dots, e_n$.
Therefore, the system of vectors $e_1, \dots, e_n$ forms a basis in $L$.

(2) Let $L$ have the basis $e_1, \dots, e_n$. In $L$ we consider an
arbitrary linearly independent system of vectors $b_1, \dots, b_m$. By the
definition of a basis, each of the vectors $b_j$ can be linearly
expressed in terms of $e_1, \dots, e_n$. Therefore, $m \leq n$ by virtue of
Lemma 1, Section 6.

Hence, any system of vectors in $L$ that contains more than n
vectors is a linearly dependent system. At the same time, the basis
$e_1, \dots, e_n$ forms a linearly independent system containing n
vectors. Thus, the dimension of $L$ is equal to n.

Remark. It is apparent from the foregoing proof that in an 
n-dimensional space any independent system of n vectors forms 
a basis. 

5. As an appendix to the theorem proved in Subsection 4, let us 
establish that the coordinate space Kn is n-dimensional. 

To do this we consider in $K_n$ the vectors

$$\begin{aligned}
e_1 &= \{1, 0, \dots, 0\}, \\
e_2 &= \{0, 1, \dots, 0\}, \\
&\;\;\vdots \\
e_n &= \{0, 0, \dots, 1\}
\end{aligned}$$

According to the definition of linear operations in $K_n$ (see
Section 2, Subsection 3), any vector in $K_n$ can be linearly expressed
in terms of the vectors $e_1, \dots, e_n$, namely,

$$x = \{x_1, x_2, \dots, x_n\} = x_1 e_1 + x_2 e_2 + \dots + x_n e_n \qquad (3)$$

From this it is clear that a linear combination of the vectors
$e_1, \dots, e_n$ is equal to $\theta = \{0, 0, \dots, 0\}$ only when all its
coefficients are equal to zero. Hence, the vectors $e_1, \dots, e_n$ are
independent and form a basis for $K_n$, and so the space $K_n$ is
n-dimensional.


§ 9. Linear operations in components 


1. Let a space $L$ be n-dimensional with a basis formed by the
vectors $e_1, \dots, e_n$.

Theorem 1. The resolution of a vector in terms of the given basis
is unique.

Proof. Suppose a vector $x$ in $L$ has two resolutions:

$$x = x_1 e_1 + \dots + x_n e_n$$

and

$$x = x'_1 e_1 + \dots + x'_n e_n$$

Then

$$(x_1 - x'_1) e_1 + \dots + (x_n - x'_n) e_n = \theta$$

and, since the vectors of the basis are linearly independent, it
follows that

$$x_1 - x'_1 = \dots = x_n - x'_n = 0$$

whence

$$x_1 = x'_1, \quad \dots, \quad x_n = x'_n$$

which completes the proof.

Corollary. All components of the zero vector $\theta$ are equal to zero
for any choice of the basis:

$$\theta = 0 \cdot e_1 + 0 \cdot e_2 + \dots + 0 \cdot e_n \qquad (1)$$

Theorem 2. When a vector is multiplied by a scalar, each
component of the vector is multiplied by that scalar. In adding two
vectors, we add the corresponding components.

Proof. Given the vectors $x$, $y$. Expanding them in terms of the
basis, we have

$$\begin{aligned}
x &= x_1 e_1 + \dots + x_n e_n, \\
y &= y_1 e_1 + \dots + y_n e_n
\end{aligned}$$

Let $\alpha$ be an arbitrary scalar. By the axioms of a linear space
we have

$$\alpha x = \alpha (x_1 e_1 + \dots + x_n e_n) = (\alpha x_1) e_1 + \dots + (\alpha x_n) e_n$$

Thus, the vector $\alpha x$ has components $\alpha x_1, \dots, \alpha x_n$. Furthermore,

$$x + y = (x_1 e_1 + \dots + x_n e_n) + (y_1 e_1 + \dots + y_n e_n) = (x_1 + y_1) e_1 + \dots + (x_n + y_n) e_n$$

That is, the vector $x + y$ has the components $x_1 + y_1, \dots, x_n + y_n$.
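
Theorem 2 can be verified numerically in a basis that is not the standard one. The Python sketch below assumes a hypothetical basis $e_1 = \{1, 1\}$, $e_2 = \{1, -1\}$ of the coordinate space $K_2$ and finds components by Cramer's rule; the names `E1`, `E2` and `components` are illustrative.

```python
from fractions import Fraction

E1, E2 = (1, 1), (1, -1)  # hypothetical basis of the coordinate space K_2

def components(v):
    """Components (x1, x2) of v relative to E1, E2, found by Cramer's rule."""
    d = E1[0] * E2[1] - E2[0] * E1[1]
    x1 = Fraction(v[0] * E2[1] - E2[0] * v[1], d)
    x2 = Fraction(E1[0] * v[1] - v[0] * E1[1], d)
    return (x1, x2)

x, y = (3, 1), (0, 4)
total = (x[0] + y[0], x[1] + y[1])
tripled = (3 * x[0], 3 * x[1])

cx, cy = components(x), components(y)
# Components of the sum are the sums of the components.
print(components(total) == (cx[0] + cy[0], cx[1] + cy[1]))  # True
# Components of 3x are three times the components of x.
print(components(tripled) == (3 * cx[0], 3 * cx[1]))        # True
```

Here x has components 2 and 1 relative to the chosen basis, since $2\{1, 1\} + 1\{1, -1\} = \{3, 1\}$.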
2. Let $a, b, \dots, q$ be an arbitrary system of vectors in $L$.
Expand each of them in terms of the basis:

$$\begin{aligned}
a &= a_1 e_1 + \dots + a_n e_n, \\
b &= b_1 e_1 + \dots + b_n e_n, \\
&\;\;\vdots \\
q &= q_1 e_1 + \dots + q_n e_n
\end{aligned} \qquad (2)$$

Along with the vectors (2) let us consider the matrix $M$ formed by
their components:

$$M = \begin{Vmatrix}
a_1 & \dots & a_n \\
b_1 & \dots & b_n \\
\vdots & & \vdots \\
q_1 & \dots & q_n
\end{Vmatrix}$$

The following theorem holds true. 

Theorem 3. The rank of the system of vectors (2) is equal to 
the rank of the matrix M. 

Proof. Suppose that the vectors of (2) are linearly related:

$$\alpha a + \beta b + \dots + \kappa q = \theta \qquad (3)$$

Then from formulas (1), (3), Theorem 1, and Theorem 2 we have

$$\alpha \{a_1, \dots, a_n\} + \beta \{b_1, \dots, b_n\} + \dots + \kappa \{q_1, \dots, q_n\} = \{0, \dots, 0\} \qquad (4)$$

In other words, the rows of the matrix $M$ are linearly related
with the same coefficients $\alpha, \beta, \dots, \kappa$. Conversely, from (4)
follows (3). The reasoning is similar if instead of the entire system
(2) we take some subsystem and the corresponding subsystem of
rows of $M$ (that is, rows containing the components of the vectors
of the chosen subsystem). Therefore, a subsystem of vectors of the
system (2) is linearly independent if and only if the corresponding
subsystem of rows of the matrix $M$ is linearly independent. This
means that the maximum number of linearly independent vectors
of the system (2) coincides with the maximum number of linearly
independent rows of $M$. The proof of Theorem 3 is complete.

3. If the number of vectors in the system (2) is equal to n, M 
becomes a square matrix. We then obtain the following corollary to 
the preceding theorem. 

A system of n vectors in n-dimensional space is linearly
dependent if and only if the determinant of the matrix of the components
of the vectors is equal to zero:

$$\Delta = \det M = 0$$

that is, if the rank of the matrix $M$ is less than n.

It will readily be seen that this assertion actually does not differ
from the theorem stated in Subsection 5 of Section 5. It is often
used as a practical verification of the linear dependence or
independence of specific systems of vectors.
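
As a practical illustration of this criterion, the following Python sketch forms the matrix of components and tests whether its determinant vanishes. It assumes that n vectors of n components each are supplied, with integer or rational components; the names `det` and `is_linearly_dependent` are hypothetical.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 0:
        return Fraction(1)
    return sum(
        ((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
         for j in range(len(m))),
        Fraction(0),
    )

def is_linearly_dependent(vectors):
    """n vectors with n components each are dependent iff det M = 0."""
    matrix = [list(v) for v in vectors]  # rows of M are the component rows
    return det(matrix) == 0

print(is_linearly_dependent([(1, 2, 3), (2, 4, 6), (1, 0, 1)]))  # True
print(is_linearly_dependent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # False
```

In the first system the second vector is twice the first, so the determinant is zero; the second system is the standard basis of $K_3$.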

§ 10. Isomorphism between linear spaces 

1. Let there be given two linear spaces $L$ and $L'$ and a one-to-one
correspondence established between them, that is,

(1) to each vector $a$ in $L$ there is associated a vector $a'$ in $L'$;

(2) distinct vectors in $L$ have distinct images in $L'$;

(3) the images of elements of $L$ fill $L'$ completely.

Definition. The spaces $L$ and $L'$ are said to be linearly
isomorphic if a one-to-one correspondence can be established between
them with the following conditions holding true:

$$(a + b)' = a' + b' \qquad (1)$$

$$(\alpha a)' = \alpha a' \qquad (2)$$

The one-to-one correspondence that satisfies conditions (1) and 
(2) is termed a linear isomorphism between the spaces L and L'. 

In other words, in a linear isomorphism the image of a sum is 
equal to the sum of the images, and the image of a product of a 
vector by a scalar is equal to the product of its image by that 
scalar. The algebraic and geometric properties of linearly isomor¬ 
phic spaces are absolutely identical. 

Remark. A linear isomorphism is possible only if the numerical 
factors in both L and L' are taken from one and the same al¬ 
gebraic field (for example, both spaces L and L' must be real or 
both complex). For instance, if L is complex and L' is real, the 
condition (2) cannot be fulfilled because the multiplication by com¬ 
plex factors that is admissible in L is not defined in L'. 

Theorem 1. For every n, all n-dimensional real spaces are
linearly isomorphic among themselves.

Theorem 1a. For every n, all n-dimensional complex spaces are
linearly isomorphic among themselves.

The proofs of Theorem 1 and Theorem 1a coincide formally, the
sole difference being that the numerical factors are taken from
different fields. Suppose $L$ and $L'$ are both n-dimensional and both
real or both complex. We choose an arbitrary basis in each of
them: $e_1, \dots, e_n \in L$; $e'_1, \dots, e'_n \in L'$. *

Let $x$ be an element of $L$. Expand it in terms of the basis:

$$x = x_1 e_1 + \dots + x_n e_n$$

* The symbol $\in$ denotes the membership of a given element in a given set.
We write $e_1 \in L$ and read: "$e_1$ is an element of $L$", or "$e_1$ is in $L$".

Let us now associate with element $x$ an element $x' \in L'$ such
that

$$x' = x_1 e'_1 + \dots + x_n e'_n$$

This correspondence is one-to-one due to the theorem on the
uniqueness of resolving a vector in terms of the basis. Let us verify
the conditions of the isomorphism:

$$\begin{aligned}
(1)\quad (x + y)' &= (x_1 + y_1) e'_1 + \dots + (x_n + y_n) e'_n \\
&= (x_1 e'_1 + \dots + x_n e'_n) + (y_1 e'_1 + \dots + y_n e'_n) \\
&= x' + y'; \\
(2)\quad (\alpha x)' &= (\alpha x_1) e'_1 + \dots + (\alpha x_n) e'_n \\
&= \alpha (x_1 e'_1 + \dots + x_n e'_n) = \alpha x'
\end{aligned}$$

We see that the established correspondence between $L$ and $L'$
satisfies the conditions (1) and (2). This completes the proof of
the theorem.
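
The construction used in the proof can be illustrated on a concrete pair of spaces. The example below is not taken from the text: it assumes that $L$ is the space of real symmetric matrices of order 2, with the basis consisting of the matrices having the component rows (1, 0, 0), (0, 1, 0), (0, 0, 1) under the map below, and that $L'$ is the coordinate space $K_3$. The Python names `image`, `add` and `scale` are hypothetical.

```python
def image(s):
    """Send the symmetric matrix [[p, q], [q, r]] to the row (p, q, r) of K_3."""
    (p, q), (q2, r) = s
    assert q == q2, "matrix must be symmetric"
    return (p, q, r)

def add(s, t):
    """Sum of two 2 x 2 matrices."""
    return [[s[i][j] + t[i][j] for j in range(2)] for i in range(2)]

def scale(alpha, s):
    """Product of a 2 x 2 matrix by the scalar alpha."""
    return [[alpha * s[i][j] for j in range(2)] for i in range(2)]

S = [[1, 2], [2, 5]]
T = [[0, -1], [-1, 3]]

# Condition (1): the image of a sum is the sum of the images.
print(image(add(S, T)) == tuple(u + v for u, v in zip(image(S), image(T))))  # True
# Condition (2): the image of a product by a scalar is the product of the image.
print(image(scale(3, S)) == tuple(3 * u for u in image(S)))                  # True
```

The map is one-to-one because a symmetric matrix of order 2 is determined by the three numbers p, q, r, and every row of $K_3$ arises in this way.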

Remark. Actually, what has been proved is that any two linear 
spaces of the same dimension over one and the same algebraic 
field are isomorphic. 

2. Because of the theorem just proved, all n-dimensional real
linear spaces are isomorphic to the real coordinate space $K_n$; all
n-dimensional complex spaces are isomorphic to the complex
space $K_n$. Thus, without any loss of generality, in the theory of
n-dimensional linear spaces we can confine ourselves to the study
of $K_n$ spaces.

3. Theorem 2. A linear space isomorphic to an n-dimensional 
space is itself n-dimensional. 

Proof. Given an n-dimensional linear space $L$. Let $L'$ be a space
isomorphic to $L$. We will first prove that under a linear
isomorphism the image of the zero element $\theta \in L$ is the zero element of
the space $L'$. For this purpose we take an arbitrary element $a' \in L'$
and its original (preimage) $a \in L$. Since

$$a = a + \theta$$

it follows that

$$a' = (a + \theta)' \qquad (3)$$

But by the definition of an isomorphism,

$$(a + \theta)' = a' + \theta' \qquad (4)$$

From (3) and (4) it follows that $a' + \theta' = a'$. Therefore, the
image $\theta'$ of element $\theta$ is the zero element of the space $L'$.

It will now be shown that if we take an independent system of
vectors in $L$, then their images will be independent vectors in $L'$.

Let $a, b, \dots, q$ be an independent system in $L$. Consider the
relation

$$\alpha a' + \beta b' + \dots + \kappa q' = \theta' \qquad (5)$$

By the definition of an isomorphism, (5) can be rewritten thus:

$$(\alpha a + \beta b + \dots + \kappa q)' = \theta'$$

And since the preimage of the zero element is the zero vector of $L$,
it follows that

$$\alpha a + \beta b + \dots + \kappa q = \theta \qquad (6)$$

By virtue of the linear independence of the vectors $a, b, \dots, q$ in
$L$, it follows from (6) that

$$\alpha = \beta = \dots = \kappa = 0 \qquad (7)$$

Thus, from (5) follows (7). Hence, the vectors $a', b', \dots, q'$ are
independent in $L'$. Since the space $L$ is n-dimensional, it has n
linearly independent vectors. Their images in $L'$ are also
independent. Hence, the dimensionality of $L'$ is not less than n. In this
discussion, we can interchange $L$ and $L'$ to find that the
dimensionality of $L$ is not less than that of $L'$. Therefore, $L'$ has the
dimension n and the theorem is proved.

Corollary 1. Finite-dimensional spaces of unlike dimensions are 
not isomorphic. 

Corollary 2. An infinite-dimensional space is not isomorphic to 
any finite-dimensional space. 

§ 11. Correspondence between complex and real spaces

1. Finite-dimensional complex and real linear spaces stand in a
relation to one another that we will now discuss. We begin with
an example.

Geometric vectors located on a single straight line form a
one-dimensional real linear space. This is because an arbitrary nonzero
vector multiplied by a real number can be transformed into any
other collinear vector.

Geometric vectors located in a plane form a two-dimensional
real space. Here, a fixed vector can no longer be transformed into
any other vector by multiplication. The supply of real factors is too
small compared with the diversity of vectors making up that
space, and so two vectors may prove to be linearly independent.

The supply of complex factors is twice as rich. Therefore,
multiplication of vectors by complex numbers may be defined so that the
collection of geometric vectors in the plane turns into a
one-dimensional complex space. This requires the possibility, via
multiplication, of transforming any nonzero vector of the given plane
into any other vector of the same plane.

This problem can be solved if we define the product of a
geometric vector by a complex number in the following manner.

Let a be an arbitrary vector in the plane. We assume that it is
laid off from the coordinate origin. Now let
$\alpha = \rho(\cos\varphi + i\sin\varphi)$ be a complex factor. Turn vector a
around the origin of coordinates through the angle $\varphi$ and then
multiply it by the real number $\rho$. Denote the resulting vector by b
and set $\alpha a = b$. As before, we add vectors by the parallelogram rule.

With that definition of addition and multiplication, all the
axioms of a linear space hold true. To see this, it suffices to note
that the complex numbers themselves are depicted by vectors in
the plane and that here addition of vectors and multiplication of
a vector a by a complex number $\alpha$ are defined in exactly the same
way as we ordinarily define addition of complex numbers and
multiplication of a complex number a by a complex number $\alpha$.
Therefore, in our case the axioms (1)-(8) hold true since they hold for
complex numbers. Now any single nonzero vector forms a linearly
independent system, and any two vectors are linearly dependent
(since multiplication includes a rotation) so that the resulting
complex space is one-dimensional.

2. We have seen that a one-dimensional complex space and a 
two-dimensional real space can be constructed out of the same 
objects, namely, out of vectors in the plane, with addition of 
vectors being defined in both cases identically. 

Multiplication is defined differently, which is unavoidable since 
the supplies of the factors differ. Note however that multiplication 
by real numbers is performed in the same manner in these spaces. 

3. It is easy to see that the foregoing example is a special case
of a more general phenomenon: with every complex linear space
is associated a real space of twice the dimensionality of the
complex space. Although the correspondence is not an isomorphism,
it very much resembles one. Namely, the following theorem holds.

Theorem. A complex linear space $C_n$ of dimension n may be
mapped one-to-one onto a real linear space $L_{2n}$ of dimension 2n so
that the condition

$$(a + b)' = a' + b' \qquad (1)$$

holds, and for real factors $\lambda$ the following condition holds:

$$(\lambda a)' = \lambda a' \qquad (2)$$

Remark. As in Section 10, the prime indicates the image in $L_{2n}$
of an element of $C_n$.

Proof. According to Section 10, all n-dimensional complex
spaces are isomorphic. We can therefore confine ourselves to any one
of them. For the $C_n$ space let us take a complex coordinate space.
Let

$$a = \{x_1 + i x_2,\; x_3 + i x_4,\; \dots,\; x_{2n-1} + i x_{2n}\}$$

be any element in $C_n$. With this element we pair the element

$$a' = \{x_1, x_2, x_3, x_4, \dots, x_{2n-1}, x_{2n}\}$$

taken from the real coordinate space which plays the role of $L_{2n}$.
Since the decomposition of a complex number into the real part
and the imaginary part is performed in unique fashion, the
correspondence established between $C_n$ and $L_{2n}$ is one-to-one. The truth
of (1), and of (2) for real $\lambda$, is obvious.

Remark. For n = 1 we have 

a = {x + iy}, a' = {x,y} 
which returns us to the original example. 
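
The pairing used in the proof can be tried out numerically. The following is only an illustrative sketch in Python: the helper names `realify`, `add` and `scale` are hypothetical and not part of the text, and the sample vectors are chosen so that the floating-point arithmetic is exact. It checks conditions (1) and (2) on one example and shows what multiplication by the non-real factor i does to the image.

```python
# Sketch: pair a vector of C^n with a vector of R^(2n) by splitting each
# complex coordinate into its real and imaginary parts (hypothetical helpers).

def realify(a):
    """Map {x1 + i*x2, x3 + i*x4, ...} to {x1, x2, x3, x4, ...}."""
    out = []
    for z in a:
        out.extend([z.real, z.imag])
    return out

def add(u, v):
    """Coordinate-wise sum of two vectors."""
    return [p + q for p, q in zip(u, v)]

def scale(c, u):
    """Product of the vector u by the factor c."""
    return [c * p for p in u]

a = [1 + 2j, 3 - 1j]
b = [0.5 - 4j, -2 + 0j]

# Condition (1): the image of a sum is the sum of the images.
sum_ok = realify(add(a, b)) == add(realify(a), realify(b))

# Condition (2): a real factor may be carried through the prime.
real_ok = realify(scale(2.0, a)) == scale(2.0, realify(a))

# A non-real factor has no scalar counterpart on the real side:
# multiplying {1} by i turns the image {1, 0} into {0, 1} (a rotation).
rotated = realify(scale(1j, [1 + 0j]))
```

The last line reflects why the correspondence is not an isomorphism: the image of ia is obtained from a' by a rotation, not by multiplication by a real number.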

4. In the sequel, we will assume all geometric vectors to be
elements of real space.

§ 12. Linear subspace 

1. Let L be a linear space and $\tilde{L}$ a certain set of elements in L.

Definition. The set $\tilde{L}$ in the space L is called a linear subspace
if the following conditions hold true:

(1) for any x, y in $\tilde{L}$ their sum x + y also lies in $\tilde{L}$;

(2) for any $x \in \tilde{L}$ and any scalar $\alpha$, the product $\alpha x \in \tilde{L}$.

Remark. For the sake of brevity we will often say subspace
instead of linear subspace.

2. Let $\tilde{L}$ be a linear subspace of L. The operations of addition
of vectors and their multiplication by scalars given in L will be
considered relative only to those elements that enter into $\tilde{L}$. Then
the following theorem holds true.

Theorem 1. Every linear subspace $\tilde{L}$ of a linear space L is itself
a linear space.

Proof. By the definition of a subspace, addition and scalar
multiplication are closed operations in $\tilde{L}$. The axioms (1)-(2) and
(5)-(8) of a linear space are certainly fulfilled in $\tilde{L}$ since in
general they hold true for all elements of L. Therefore, the proof only
requires the verification of axioms (3) and (4), that is, we have to
establish that together with every element x of $\tilde{L}$ the subspace $\tilde{L}$
includes the additive inverse element $-x$ and that $\theta \in \tilde{L}$.

By the second condition in the definition of a subspace we have

$$-x = (-1) \cdot x \in \tilde{L}$$

Using the first condition, we get

$$\theta = x + (-x) \in \tilde{L}$$

which completes the proof.

3. The intersection of a collection of sets is the collection of
those elements that belong simultaneously to all the sets under
consideration. The intersection of two sets $\mathcal{A}$ and $\mathcal{B}$ is denoted by
the symbol $\mathcal{A} \cap \mathcal{B}$. This notation will be used frequently in the
sequel.

Theorem 2. The intersection of any collection of subspaces of a
given linear space L is also a linear subspace.

Proof. For the sake of simplicity, the proof will be carried out
for the case of two subspaces $L_1$ and $L_2$. Let $L_3 = L_1 \cap L_2$, and
let the vectors x, y lie in $L_3$. When regarding x, y as elements of
$L_1$, we find, by the definition of a subspace, that $x + y \in L_1$,
$\alpha x \in L_1$ ($\alpha$ an arbitrary scalar). In exactly the same way,
$x + y \in L_2$, $\alpha x \in L_2$. But this means that $x + y \in L_3$, $\alpha x \in L_3$
and therefore $L_3$ satisfies the definition of a subspace. This
completes the proof of Theorem 2.

4. Examples of subspaces. (1) The set $\tilde{L}$ consisting of the single
zero element $\theta$ of a given space constitutes a subspace, for

$$\theta + \theta = \theta \in \tilde{L}, \qquad \alpha\theta = \theta \in \tilde{L}$$

(2) In the $n^2$-dimensional space of square $n \times n$ matrices, the
set of symmetric matrices $\|a_{ik}\|$, that is, such that $a_{ik} = a_{ki}$, forms
a subspace.

The set of skew-symmetric matrices, that is, such that
$a_{ik} = -a_{ki}$, also forms a subspace in the space of $n \times n$ matrices.

(3) In the space of all possible functions specified on the
interval $\tau_1 \le t \le \tau_2$, each of the following sets forms a linear
subspace:

(a) the functions continuous at some interior point $t_0$ of the
interval $\tau_1 < t < \tau_2$;

(b) the functions continuous in the interval $\tau_1 < t < \tau_2$;

(c) the functions continuous on the entire interval $[\tau_1, \tau_2]$;

(d) the functions continuous on the interval $[\tau_1, \tau_2]$ together
with their derivatives up to order N inclusive, where N is an
arbitrary positive integer;

(e) the functions having derivatives of all orders on the interval
$[\tau_1, \tau_2]$;

(f) all polynomials considered on the interval $[\tau_1, \tau_2]$;

(g) polynomials of degree not exceeding a fixed integer N > 0.

Each of the subspaces enumerated in Example (3) is contained
in the preceding one, and all of them, with the exception of the
last one, are infinite-dimensional (the last one is of dimension
N + 1);

(h) on the interval $[\tau_1, \tau_2]$ fix an arbitrary set of points. The
functions equal to zero at the points of this set also form a subspace.
By way of an exercise we leave it to the reader to figure out how
the dimension of this subspace depends on the choice of the set.
(4) Let L be the three-dimensional space of geometric vectors 
in ordinary Euclidean space. We assume the vectors to issue from 
the coordinate origin. Let us consider all the vectors located in 
some plane passing through the origin. These vectors form a sub¬ 
space. 

It is left to the reader to prove that the above-mentioned
subspaces do indeed satisfy the definition of Subsection 1.

5. The following are instances of subsets of a linear space that
are not subspaces.

(a) In the three-dimensional space of geometric vectors we
consider the collection of vectors whose termini lie in a fixed plane
not passing through the coordinate origin. They do not form a
subspace since both the sum of two vectors and the product of a
vector by any scalar $\ne 1$ are not members of this subset.

(b) In the same space we consider vectors whose termini lie on
the surface of a cone with vertex at the coordinate origin. The
product of any vector of this set by any scalar is also a member of
this set. Nevertheless, the indicated set is not a subspace since,
generally speaking, it is not closed under the operation of
addition.
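
Both counterexamples can be checked on concrete vectors. The sketch below is only an illustration in Python: the plane z = 1 and the cone x^2 + y^2 = z^2 are assumed particular cases, and the function names `in_plane` and `on_cone` are hypothetical. Vectors are given by the integer coordinates of their termini, so all comparisons are exact.

```python
# Sketch with assumed concrete sets: the plane z = 1 (not through the origin)
# and the cone x^2 + y^2 = z^2 (vertex at the origin).

def in_plane(v):
    """True if the terminus of v lies in the plane z = 1."""
    return v[2] == 1

def on_cone(v):
    """True if the terminus of v lies on the cone x^2 + y^2 = z^2."""
    x, y, z = v
    return x * x + y * y == z * z

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

def scale(c, u):
    return tuple(c * p for p in u)

# (a) The plane: neither sums nor multiples (factor other than 1) stay inside.
p, q = (1, 0, 1), (0, 1, 1)
plane_sum_inside = in_plane(add(p, q))       # terminus has z = 2
plane_multiple_inside = in_plane(scale(3, p))  # terminus has z = 3

# (b) The cone: multiples stay on the cone, but a sum need not.
u, v = (3, 4, 5), (5, 12, 13)
cone_multiple_inside = on_cone(scale(-2, u))
cone_sum_inside = on_cone(add(u, v))         # (8, 16, 18): 320 != 324
```

Here `plane_sum_inside`, `plane_multiple_inside` and `cone_sum_inside` come out false, while `cone_multiple_inside` is true, in agreement with (a) and (b).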

§ 13. Linear hull 

1. Suppose in a linear space L we have a system of vectors
$a_1, \dots, a_k$.

Definition. The set of all linear combinations of the form

$$x = \alpha_1 a_1 + \dots + \alpha_k a_k$$

is called a linear hull of the given system and is denoted by the
symbol $L(a_1, \dots, a_k)$.

We sometimes say that $L(a_1, \dots, a_k)$ is a linear hull spanned by
the vectors $a_1, \dots, a_k$.

Theorem 1. The linear hull of any system of vectors is a linear
subspace of the space L.

Proof. Let us take arbitrary vectors x and y from the linear
hull $L(a_1, \dots, a_k)$:

$$x = \alpha_1 a_1 + \dots + \alpha_k a_k \in L(a_1, \dots, a_k),$$

$$y = \beta_1 a_1 + \dots + \beta_k a_k \in L(a_1, \dots, a_k)$$

Then

$$x + y = (\alpha_1 + \beta_1) a_1 + \dots + (\alpha_k + \beta_k) a_k \in L(a_1, \dots, a_k)$$

Besides, for any scalar $\lambda$ we have

$$\lambda x = (\lambda\alpha_1) a_1 + \dots + (\lambda\alpha_k) a_k \in L(a_1, \dots, a_k)$$

2. Remark. The linear hull $L(a_1, \dots, a_k)$ may coincide with
the entire space L (for example, if $a_1, \dots, a_k$ is a basis in L).

3. Theorem 2. If every vector of a system $c_1, \dots, c_m$ is linearly
expressible in terms of the vectors of the system $a_1, \dots, a_k$, then

$$L(c_1, \dots, c_m) \subset L(a_1, \dots, a_k) \; *$$

* The symbol $\subset$ indicates the inclusion of the first of the sets in the second.
$\mathcal{A} \subset \mathcal{B}$ is read as "$\mathcal{A}$ is contained in $\mathcal{B}$" (it is not precluded that $\mathcal{A}$ may
coincide with $\mathcal{B}$).

Proof. Let $x \in L(c_1, \dots, c_m)$; that is, suppose x can be
expressed in terms of $c_1, \dots, c_m$. Then, by Property (6), Subsection 2,
Section 4, the vector x can be expressed in terms of $a_1, \dots, a_k$.
Hence $x \in L(a_1, \dots, a_k)$. Thus

$$L(c_1, \dots, c_m) \subset L(a_1, \dots, a_k)$$

Corollary. The linear hull of any subsystem of a given system
of vectors is included in the linear hull of the entire given system.

Theorem 3. If the system $a_1, \dots, a_k$ has rank r > 0, then any
one of its linearly independent subsystems consisting of r vectors
is a basis in the linear hull $L(a_1, \dots, a_k)$.

Proof. The system $a_1, \dots, a_k$ of rank r (r > 0) has a linearly
independent subsystem consisting of r vectors.

For the sake of definiteness, suppose that the first r vectors of
the given system are linearly independent. Then, since the rank of
the system $a_1, \dots, a_k$ is equal to r, it follows that each of the
vectors $a_1, \dots, a_k$ is linearly expressible in terms of the vectors
$a_1, \dots, a_r$. From this and by Theorem 2,

$$L(a_1, \dots, a_r, a_{r+1}, \dots, a_k) \subset L(a_1, \dots, a_r)$$

On the other hand, by the Corollary to Theorem 2,

$$L(a_1, \dots, a_r) \subset L(a_1, \dots, a_r, a_{r+1}, \dots, a_k)$$

Hence $L(a_1, \dots, a_r, a_{r+1}, \dots, a_k)$ coincides with $L(a_1, \dots, a_r)$. And
so every element x in $L(a_1, \dots, a_k)$ can be resolved in terms of
$a_1, \dots, a_r$. For this reason and because of the independence of the
vectors $a_1, \dots, a_r$, they constitute a basis in $L(a_1, \dots, a_k)$.

Theorem 4. If the rank of the system $a_1, \dots, a_k$ is equal to r,
then $L(a_1, \dots, a_k)$ is an r-dimensional subspace.

Fig. 1

Fig. 2

Proof. Suppose that r > 0. Then, by the preceding theorem,
there is a basis in $L(a_1, \dots, a_k)$ consisting of r elements, whence,
by the theorem of Subsection 4, Section 8, $L(a_1, \dots, a_k)$ is of
dimension r. Suppose r = 0. Then $a_1 = \dots = a_k = \theta$. But then
$L(a_1, \dots, a_k)$ includes only $\theta$ and, hence, has the dimension 0.

4. Let us consider some examples. (1) Let a, b, c ($a \ne \theta$) be
geometric vectors lying on a single straight line. Then
L(a, b, c) = L(a) (Fig. 1).

Here, L(a) is a one-dimensional subspace consisting of all
vectors lying on the given straight line. In this subspace, the vector a
constitutes a basis.

(2) Let a, b, c be geometric vectors with a and b not collinear,
and let c be expressible in terms of them ($c = \lambda a + \mu b$). In that
case L(a, b, c) = L(a, b) (Fig. 2) so that an arbitrary vector
$x \in L(a, b, c)$ can be represented as $x = \alpha a + \beta b$.

Here, L(a, b) is a two-dimensional subspace that consists of all
vectors coplanar with the vectors a, b. The vectors a, b constitute
a basis in L(a, b).

(3) Let the functions $1, t, \dots, t^N$ be regarded as elements
of a linear space L of continuous functions specified on the interval
$[\tau_1, \tau_2]$. Then $L(1, t, \dots, t^N)$ is a subspace of L consisting of all
polynomials of degree not higher than N.

5. In conclusion we note an obvious but important proposition:
any subspace of a finite-dimensional space is a linear hull of some
system of elements.

To prove this, it suffices to note that any subspace $\tilde{L}$ of an
n-dimensional space L is finite-dimensional (and has dimension $\le n$;
this is clear since the maximum number of linearly independent
elements in $\tilde{L}$ cannot exceed that in L). But then either $\tilde{L}$ is
zero-dimensional, and then $\tilde{L} = L(\theta)$, or $\tilde{L}$ has the basis $q_1, \dots, q_r$,
and then $\tilde{L} = L(q_1, \dots, q_r)$.

In the last case, the subspace $\tilde{L}$ consists of vectors (and only
such vectors) that have the form

$$x = t_1 q_1 + \dots + t_r q_r \qquad (1)$$

where $t_1, \dots, t_r$ are arbitrary scalars.

Suppose in the space L we have a basis $e_1, \dots, e_n$, and
$x = x_1 e_1 + \dots + x_n e_n$, $q_i = q_{1i} e_1 + q_{2i} e_2 + \dots + q_{ni} e_n$ (here
$i = 1, \dots, r$; $x_1, \dots, x_n$ are the components of the vector x, and
$q_{1i}, q_{2i}, \dots, q_{ni}$ are the components of the vector $q_i$).

Then formula (1) may be replaced by the relations in
components:

$$\begin{aligned}
x_1 &= q_{11} t_1 + q_{12} t_2 + \dots + q_{1r} t_r, \\
&\;\;\vdots \\
x_n &= q_{n1} t_1 + q_{n2} t_2 + \dots + q_{nr} t_r
\end{aligned} \qquad (2)$$

Relations of the form (2) are called the parametric equations of
the subspace $L(q_1, \dots, q_r)$.
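
The parametric equations can be read as a recipe: each choice of the parameters $t_1, \dots, t_r$ produces the components of one vector of the subspace. A minimal Python sketch follows; the function name `point_of_subspace` and the sample basis are hypothetical and serve only as an illustration.

```python
def point_of_subspace(q, t):
    """Components x_1..x_n from the parametric equations
    x_i = q_i1*t_1 + ... + q_ir*t_r.

    q is a list of r basis vectors, each given by its n components,
    so q[j][i] holds the component written q_(i+1),(j+1) in the text.
    t is the list of r parameter values.
    """
    n = len(q[0])
    return [sum(q[j][i] * t[j] for j in range(len(q))) for i in range(n)]

q1 = [1, 0, 2]
q2 = [0, 1, 1]

x = point_of_subspace([q1, q2], [3, -1])      # 3*q1 - q2
origin = point_of_subspace([q1, q2], [0, 0])  # all parameters zero give the zero vector
```

With these assumed basis vectors, the parameters (3, -1) give the vector with components 3, -1, 5.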

§ 14. Sum of subspaces. Direct sum 

1. In a linear space L let there be given two linear subspaces
$L_1$ and $L_2$. We denote by $\tilde{L}$ the set of all vectors x that can be
represented in the form

$$x = x_1 + x_2$$

where $x_1 \in L_1$, $x_2 \in L_2$ (Fig. 3). It is readily seen that $\tilde{L}$ is a
linear subspace of L. Indeed, together with $x \in \tilde{L}$ take another
vector $x' \in \tilde{L}$, that is,

$$x' = x_1' + x_2'$$

where $x_1' \in L_1$, $x_2' \in L_2$; then the vector

$$x + x' = (x_1 + x_1') + (x_2 + x_2')$$

belongs to $\tilde{L}$ since $x_1 + x_1' \in L_1$, $x_2 + x_2' \in L_2$. Besides,

$$\alpha x = \alpha x_1 + \alpha x_2 \in \tilde{L} \quad \text{since} \quad \alpha x_1 \in L_1, \; \alpha x_2 \in L_2$$

The subspace $\tilde{L}$ is called the sum of the subspaces $L_1$ and $L_2$ and
is denoted thus: $\tilde{L} = L_1 + L_2$. Fig. 3 depicts a special case where
$\tilde{L} = L$ is three-dimensional and $L_1$ and $L_2$ are two-dimensional.

2. The notion of a sum of subspaces carries over directly to any
number of terms. Given in space L the subspaces $L_1, \dots, L_k$; their
sum

$$\tilde{L} = L_1 + L_2 + \dots + L_k$$

is then a linear subspace consisting of all vectors of the form

$$x = x_1 + \dots + x_k \quad \text{where} \quad x_1 \in L_1, \dots, x_k \in L_k \qquad (1)$$

3. Definition. If for every $x \in \tilde{L}$ the resolution (1) is unique,
then $\tilde{L}$ is called the direct sum of the subspaces $L_1, \dots, L_k$.

For the direct sum we use the symbol $\oplus$, for example,

$$\tilde{L} = L_1 \oplus L_2 \oplus \dots \oplus L_k$$

We will use the symbol $\oplus$ in cases where it is necessary to stress
that we are dealing with a direct sum.

By way of an illustration, Fig. 4 depicts the direct sum of the
one-dimensional subspaces $L_1$ and $L_2$. Note that the sum $L_1 + L_2$
in Fig. 3 is not a direct sum.

In the next two subsections we give the conditions that are
necessary and sufficient for a sum of subspaces to be a direct
sum.

4. Theorem 1. The sum $\tilde{L} = L_1 + \dots + L_k$ is a direct sum of
the subspaces $L_1, \dots, L_k$ if and only if not one of the subspaces
$L_1, \dots, L_k$ has any elements, except $\theta$, in common with the sum of
the remaining subspaces.

Proof. (1) Suppose it is given that the intersection of each of the
subspaces under consideration with the sum of the remaining ones
consists solely of the zero vector $\theta$. We will prove that
$\tilde{L} = L_1 \oplus \dots \oplus L_k$. Suppose that there are two resolutions of the
vector $x \in \tilde{L}$:

$$x = x_1 + x_2 + \dots + x_k, \qquad x = \bar{x}_1 + \bar{x}_2 + \dots + \bar{x}_k \qquad (2)$$

where $x_j \in L_j$, $\bar{x}_j \in L_j$. It is required to verify that

$$\bar{x}_j = x_j \qquad (3)$$

for each of the numbers j. From (2) we have

$$\theta = (\bar{x}_1 - x_1) + (\bar{x}_2 - x_2) + \dots + (\bar{x}_k - x_k) \qquad (4)$$

Set $y_1 = -(\bar{x}_1 - x_1)$. Then $y_1 = x_1 - \bar{x}_1 \in L_1$ and, by (4),
$y_1 = (\bar{x}_2 - x_2) + \dots + (\bar{x}_k - x_k) \in L_2 + \dots + L_k$, and therefore
$y_1 = \theta$, that is, $\bar{x}_1 = x_1$.

Now, introducing $y_2 = -(\bar{x}_2 - x_2)$ and taking advantage of (4)
and also of the fact that $L_2 \cap (L_1 + L_3 + \dots + L_k) = \theta$, we get
$\bar{x}_2 = x_2$.

The remaining equations of (3) are proved similarly.

(2) Suppose that for any $x \in \tilde{L}$ the resolution (1) is unique.
We will demonstrate that, for example, $L_1$ does not have any
common elements with $L_2 + \dots + L_k$ except $\theta$. Suppose the opposite,
that is, that there is a $z \ne \theta$ such that $z \in L_1$, $z \in L_2 + \dots + L_k$.
But then $z = z_2 + \dots + z_k$, where $z_2 \in L_2, \dots, z_k \in L_k$. We can
therefore write

$$\theta = z + (-1) z_2 + \dots + (-1) z_k$$

where $z \in L_1$, $z_2 \in L_2$, ..., $z_k \in L_k$. On the other hand,

$$\theta = \theta + \theta + \dots + \theta$$

Thus, for $\theta \in \tilde{L}$ we have obtained two distinct resolutions of the
form (1), which is a contradiction of the hypothesis.

5. Theorem 2. $\tilde{L} = L_1 + \dots + L_k$ is a direct sum of the
subspaces $L_1, \dots, L_k$ if and only if every system of nonzero vectors
$a_1, \dots, a_k$ taken one at a time from each $L_j$ (i.e., $a_j \in L_j$,
$j = 1, \dots, k$) is linearly independent.

Proof. (1) Let $\tilde{L}$ be the direct sum of the subspaces $L_1, \dots, L_k$.
We take arbitrary nonzero vectors $a_1, \dots, a_k$ one at a time from
each $L_j$ ($j = 1, \dots, k$). We will prove that they are linearly
independent. Suppose that the system $a_1, \dots, a_k$ is dependent. Then one of
these vectors is linearly expressible in terms of the others. For the
sake of definiteness we assume that $a_1$ is linearly expressible in
terms of $a_2, \dots, a_k$. But then this vector belongs to $L_1$ and to the
sum $L_2 + \dots + L_k$, which runs counter to Theorem 1.

(2) Suppose any system of nonzero vectors $a_1, \dots, a_k$ taken
respectively from $L_1, \dots, L_k$ is linearly independent. We will prove
that $\tilde{L} = L_1 + \dots + L_k$ is a direct sum. Assume the contrary. Then,
by Theorem 1, one of the subspaces $L_1, \dots, L_k$ has a nonzero
vector in common with the sum of the remaining ones. For example,
suppose the nonzero vector $a_1$ belongs to $L_1$ and to $L_2 + \dots + L_k$.
Then $a_1 = a_2' + \dots + a_k'$, where $a_2' \in L_2, \dots, a_k' \in L_k$. In place of this
relation we can write $a_1 = \varepsilon_2 a_2 + \dots + \varepsilon_k a_k$, taking $\varepsilon_j = 0$ in the
case $a_j' = \theta$, and, in this case, taking for $a_j$ any vector from $L_j$ so
long as it is not equal to $\theta$; but if $a_j' \ne \theta$, then we assume that
$\varepsilon_j = 1$ and $a_j = a_j'$.

Thus are indicated the nonzero vectors $a_1, \dots, a_k$ ($a_j \in L_j$) which
are connected by a linear dependence. The result is a contradiction
with the hypothesis of the theorem.

6. If the space L itself is resolved into the direct sum of its
subspaces $L_1, \dots, L_k$, then each vector x is uniquely resolved into
its components $x_1, \dots, x_k$ lying respectively in $L_1, \dots, L_k$.

In particular, if $e_1, \dots, e_n$ is a basis in L, then L can be
resolved into the direct sum of one-dimensional subspaces:
$L = L_1 \oplus \dots \oplus L_n$, where $L_i$ is the linear hull of the basis vector $e_i$
(that is, $L_i$ consists of vectors obtained by multiplying $e_i$ by all
possible scalars).

7. Theorem 3. Given in a linear space L the subspaces $L_k$ and
$L_l$ of dimension k and l respectively. If their intersection is of
dimension m, then the dimension of their sum $L_k + L_l$ is equal to
$r = k + l - m$. *

* In particular, for m = 0, the sum $L_k + L_l$ will be a direct sum.

To prove Theorem 3 we will need the following lemma.

Lemma. In n-dimensional space, any independent system of
fewer than n vectors may be completed to constitute a basis.

Proof of lemma. Let $e_1, \dots, e_k$ be an independent system of
vectors, k < n. There will be at least one vector $e_{k+1}$ such that
$e_1, \dots, e_k, e_{k+1}$ is also an independent system. If there were no
such vector $e_{k+1}$, then any vector of L could be expressed in terms
of $e_1, \dots, e_k$, but this contradicts the hypothesis k < n.

If k + 1 < n, the above argument may be repeated and it is
possible to adjoin one more vector to the system without
disrupting the linear independence. This procedure may be continued
until the number of vectors in the system reaches n; then it will
turn into a basis. The proof of the lemma is complete.

Proof of Theorem 3. Put $L_m = L_k \cap L_l$ and choose in the
subspace $L_m$ a basis $e_1, \dots, e_m$. Using the lemma, complete this basis
to bases in the subspaces $L_k$ and $L_l$:

$$e_1, \dots, e_m, e_{m+1}, \dots, e_k \quad \text{is a basis in } L_k,$$

$$e_1, \dots, e_m, e'_{m+1}, \dots, e'_l \quad \text{is a basis in } L_l$$

Comparing the definition of a sum of subspaces with the
definition of a linear hull (see Section 13), we find that

$$L_k + L_l = L(e_1, \dots, e_m, e_{m+1}, \dots, e_k, e'_{m+1}, \dots, e'_l) \qquad (5)$$

We will prove that the vectors

$$e_1, \dots, e_m, e_{m+1}, \dots, e_k, e'_{m+1}, \dots, e'_l \qquad (6)$$

are linearly independent. Assume the contrary. Suppose there
exists the nontrivial relation

$$\alpha_1 e_1 + \dots + \alpha_m e_m + \alpha_{m+1} e_{m+1} + \dots + \alpha_k e_k + \alpha_{k+1} e'_{m+1} + \dots + \alpha_r e'_l = \theta \qquad (7)$$

Among the numbers $\alpha_{k+1}, \dots, \alpha_r$ there are those that differ from
zero, otherwise the vectors $e_1, \dots, e_m, e_{m+1}, \dots, e_k$ that form a
basis in $L_k$ would be linearly dependent. Set

$$\alpha_{k+1} e'_{m+1} + \dots + \alpha_r e'_l = x \qquad (8)$$

From (8) it follows that $x \in L_l$, and from (7), that $x \in L_k$;
therefore $x \in L_l \cap L_k$. Hence, x is linearly expressible in terms of
the vectors $e_1, \dots, e_m$. Thus we have a relation of the form

$$\alpha_{k+1} e'_{m+1} + \dots + \alpha_r e'_l = \beta_1 e_1 + \dots + \beta_m e_m \qquad (9)$$

Equation (9) signifies a nontrivial linear dependence among the
vectors $e_1, \dots, e_m, e'_{m+1}, \dots, e'_l$, which is impossible since the
indicated vectors form a basis in $L_l$. This is a contradiction, which
completes the proof of the independence of the vectors (6). Now,
relation (5) signifies that the vectors (6) form a basis in $L_k + L_l$.
Hence, the dimension of $L_k + L_l$ is equal to the number of vectors
in the system (6), that is, to the number $r = k + l - m$. The proof
of Theorem 3 is complete.
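
The formula r = k + l - m can be checked on a small example in four-dimensional coordinate space. The Python sketch below is only an illustration under stated assumptions: the two subspaces are given by hypothetical spanning vectors, the helper `rank` is an exact Gaussian elimination written for the example, and the one-dimensional intersection is exhibited by hand rather than computed.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a system of coordinate vectors (Gaussian elimination, exact)."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Assumed example: L_k is spanned by e1, e2, e3 and L_l by e2 + e3, e4.
basis_k = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
basis_l = [(0, 1, 1, 0), (0, 0, 0, 1)]

k = rank(basis_k)
l = rank(basis_l)
r = rank(basis_k + basis_l)   # dimension of the sum, as the rank of the joint system
m = k + l - r                 # dimension of the intersection predicted by Theorem 3

# The vector w = e2 + e3 lies in both subspaces: adjoining it changes neither rank.
w = (0, 1, 1, 0)
w_in_both = rank(basis_k + [w]) == k and rank(basis_l + [w]) == l
```

Here the sum is the whole four-dimensional space, and the predicted value m = 1 agrees with the fact that the intersection consists of the multiples of w.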

8. Theorem 4. The dimension of a direct sum of subspaces is
equal to the sum of the dimensions of the summands. The union of
any bases taken one at a time in each summand forms a basis in
the direct sum.

Proof. If there are two summands, then the first assertion of
Theorem 4 follows from Theorem 3 with account taken of
Theorem 1, as a consequence of which m = 0.

The second assertion of Theorem 4 in the case of two summands
follows from the proof of Theorem 3, more precisely, from the fact
that the system of vectors (6) is independent (now in the notation
of (6) we have to put m = 0 and strike out the vectors $e_1, \dots, e_m$).

Further note that if $L_1 + L_2 + L_3 + \dots + L_k$ is a direct sum,
then $L_1 + L_2$, $L_1 + L_2 + L_3 = (L_1 + L_2) + L_3$, etc., are also direct
sums. Therefore, in the general case, both assertions of Theorem 4
are proved by induction.


9. Finally, we note the following associative property for direct
sums: if

$$L = L_1 \oplus L' \qquad (I)$$

$$L' = L_2 \oplus \dots \oplus L_k \qquad (II)$$

then

$$L = L_1 \oplus L_2 \oplus \dots \oplus L_k \qquad (III)$$

Proof. Let x be any element in L. We have $x = x_1 + x'$, where
$x_1 \in L_1$, $x' \in L'$. Since $x' \in L'$, it follows that $x' = x_2 + \dots + x_k$
where $x_2 \in L_2, \dots, x_k \in L_k$. Thus, for every x in L we have

$$x = x_1 + x_2 + \dots + x_k \qquad (*)$$

where $x_j \in L_j$. Conversely, from (*) it follows that $x \in L$. We will
prove the uniqueness of (*). Let

$$x = \bar{x}_1 + \bar{x}_2 + \dots + \bar{x}_k, \qquad \bar{x}_j \in L_j$$

whence $x = \bar{x}_1 + \bar{x}'$. Here $\bar{x}' = \bar{x}_2 + \dots + \bar{x}_k$. Consequently, $\bar{x}' \in L'$.
Therefore and by the definition (I) we get $\bar{x}_1 = x_1$, $\bar{x}' = x'$. From
the last equation and by definition (II) we find $\bar{x}_j = x_j$ for
$j = 2, \dots, k$. Together with $\bar{x}_1 = x_1$ this gives $\bar{x}_j = x_j$ for
$j = 1, 2, \dots, k$, which proves (III).

This property has to be used when the resolution of a space (or
subspace) L into a direct sum is performed in succession:
$L = L_1 \oplus L'$, $L' = L_2 \oplus L''$, and so on (see for example Chapter VII,
Section 10).
LINEAR TRANSFORMATIONS
OF VARIABLES.
TRANSFORMATIONS OF COORDINATES


§ 1. Abbreviated notation for summation 

1. In the sequel we will often have to do with sums in which the
summands are denoted by a single indexed letter; for example,
$a_m + a_{m+1} + \dots + a_N$. In such cases it is convenient to use the
following abbreviated notation for the sum:

$$a_m + a_{m+1} + \dots + a_N = \sum_{i=m}^{N} a_i$$

(read: "the sum of $a_i$ from i = m to N").

2. Properties of the summation symbol.

(1) $\sum_{i=1}^{N} a = Na$, since there are N identical terms equal to a for
every i.

(2) A common factor can be taken outside the summation
symbol:

$$\sum_{i=m}^{N} c a_i = c \sum_{i=m}^{N} a_i$$

(3) $$\sum_{i=m}^{N} (a_i + b_i) = \sum_{i=m}^{N} a_i + \sum_{i=m}^{N} b_i$$

(4) The magnitude of the sum does not depend on the letter
used for the summation index:

$$\sum_{i=m}^{N} a_i = a_m + a_{m+1} + \dots + a_N = \sum_{j=m}^{N} a_j = \sum_{k=m}^{N} a_k$$

(5) If the summation is over two distinct indices, each of which
varies independently of the other, the order in which they are
summed is immaterial:

$$\sum_{i=m_1}^{N_1} \Bigl( \sum_{j=m_2}^{N_2} a_{ij} \Bigr) = \sum_{j=m_2}^{N_2} \Bigl( \sum_{i=m_1}^{N_1} a_{ij} \Bigr)$$

When summing over different indices, one ordinarily drops the
parentheses and in place of $\sum_{i=m_1}^{N_1} \bigl( \sum_{j=m_2}^{N_2} a_{ij} \bigr)$ one writes
$\sum_{i=m_1}^{N_1} \sum_{j=m_2}^{N_2} a_{ij}$.
It is assumed that the terms $a_{ij}$ are first summed with respect to j
with i held constant (inner sum), and then the resulting quantities
are summed with respect to i (outer sum). The fifth property
carries over to the case of summation with respect to three or more
distinct indices.

Note that if the range of one index depends on another
summation index, then upon any change in the order of summation the
range of each of the indices is, generally speaking, different. In
particular,

$$\sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij} = \sum_{j=1}^{n} \sum_{i=1}^{j} a_{ij}$$

(6) When summing over two (or several) indices, a factor that
does not depend on the index of inner summation may be taken
outside the sign of the inner sum:

$$\sum_{i=m_1}^{N_1} \sum_{j=m_2}^{N_2} a_{ij} b_i = \sum_{i=m_1}^{N_1} b_i \sum_{j=m_2}^{N_2} a_{ij}$$

The aforementioned properties of the sigma (summation) symbol
follow directly from the rules of arithmetic operations and are
made frequent use of in the sequel.
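
Properties (5) and (6) are easy to confirm on a small numerical table. The Python sketch below is only an illustration: the array `a`, the factors `b` and the value n = 4 are arbitrary assumptions. It covers the interchange of independent indices, the case where the range of the inner index depends on the outer one (the sum over all pairs with i not exceeding j, taken in two orders), and the removal of a factor from the inner sum. Python indices start at 0, so `a[i][j]` stands for the term with indices i + 1, j + 1.

```python
n = 4
# Hypothetical data: a[i][j] plays the role of a_(i+1),(j+1), b[i] of b_(i+1).
a = [[10 * i + j for j in range(1, n + 1)] for i in range(1, n + 1)]
b = [2, -1, 3, 5]

# Property (5): independent indices may be summed in either order.
rows_first = sum(sum(a[i][j] for j in range(n)) for i in range(n))
cols_first = sum(sum(a[i][j] for i in range(n)) for j in range(n))

# Dependent ranges: summing over pairs with i <= j.
# Outer index i, inner index j running from i to n ...
upper_by_i = sum(a[i][j] for i in range(n) for j in range(i, n))
# ... equals outer index j, inner index i running from 1 to j.
upper_by_j = sum(a[i][j] for j in range(n) for i in range(0, j + 1))

# Property (6): b_i does not depend on the inner index j.
inside = sum(a[i][j] * b[i] for i in range(n) for j in range(n))
outside = sum(b[i] * sum(a[i][j] for j in range(n)) for i in range(n))
```

In each pair the two values coincide, although in the second pair the limits of both indices had to be rewritten when the order was changed.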


3. To abbreviate notation, let us agree that if the range of a
summation index is not indicated, it is assumed that the
summation is from 1 to n; for example,

$$\sum_{i} a_i = \sum_{i=1}^{n} a_i$$

Besides, if the summation from 1 to n is over several indices that
are independent of one another, we will write one summation
symbol and under it all the indices over which the summation is
to be extended. That is,

$$\sum_{i,j,k} a_{ijk} = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijk}$$

The point is that later on we will very often have to do with
summing from 1 to n, where n is the dimension of a space.

Let us further agree that if the summation indices under the
summation symbol are not indicated at all, this means that the
summation is to be carried out with respect to all the indices that
appear twice under the summation symbol, and the summation is
from 1 to n with respect to each of these indices. For example,

$$\sum a_i b_i = a_1 b_1 + a_2 b_2 + \dots + a_n b_n;$$

$$\sum A_i a_{ij} x_j = \sum_{i=1}^{n} \sum_{j=1}^{n} A_i a_{ij} x_j = \sum_{i=1}^{n} A_i \sum_{j=1}^{n} a_{ij} x_j$$

Note the inner sum in the right member of the last equation. The
general term of this sum depends on two indices, but the
summation is carried out only with respect to one of them (the index j),
so that the result of the sum depends on the other index (the
number i). Putting $y_i = \sum_{j=1}^{n} a_{ij} x_j$ and taking advantage of the
abbreviated notations, we can write

$$y_i = \sum a_{ij} x_j$$

The indices with respect to which the summation is performed
are often called dummy indices (or umbral indices). By Property
(4), Subsection 2, a dummy index may be changed in the course
of a computation, as for example,

$$y_i = \sum a_{ij} x_j = \sum a_{i\alpha} x_\alpha, \qquad i = 1, \dots, n \qquad (1)$$

Indices over which summation is not performed are ordinarily
called free indices. It is important to see that the free indices in
the right and the left member of every equation are denoted in the
same way. For example, the equations (1) may be replaced by the
following equivalent notation:

$$y_k = \sum a_{kj} x_j = \sum a_{k\alpha} x_\alpha, \qquad k = 1, \dots, n$$

but it is not permissible to change the notation of the free index in
only one member of an equation.

Later on we will see (Chapter V) that in many cases it will be
convenient to write the indices as superscripts ($x^i$, $a^{ik}$, and so
forth). It is important that in the course of a computation the
superscripts remain superscripts and the subscripts remain subscripts
if they are free indices.

Later on we will have to do with summation symbols carrying
several dummy indices. In such cases, the independent dummy
indices must be denoted by distinct letters.

For example,

X BaXfiyy'' = zp, X Bk (X A'a^vy'') = X A'-z^ 

The indices /' and p remain free (this means that the preceding 
line replaces equations). 

4. Remark. Considerable use in the literature is made of a still
more compact notation in which not only the limits of the
summation and the indices are dropped but even the summation symbol
as well. In that notation, we have, for instance,

$$\sum_{k=1}^{n} A_k z^k = A_k z^k$$

The use of such notation requires well-developed habits on the part
of the reader. In this text we will not drop the summation symbol.

§ 2. Linear transformation of variables. The product of linear 
transformations of variables and matrix products 

1. Let X|, ..., x„ be an ordered n-tuple of independent variables. 
Suppose we have an m X « matrix: 


bn . 

■ b,n 

bm\ ■ 

• f’mn 


We can write the relations 

'Ji + ••• A-binX„, 

Urn = b,n\X\ + ... A- b 


( 1 ) 


where y\, ..., y,,, denote the numerical values of the right members 
of (1). It is clear that y\, , y,„ vary with ati, ..., x„. 

The set of relations (I) is called a linear transformation of the 

variables . .. x„ into the variables «/i, ..., The numbers 6,* 

are called the coefticients of the linear transformation (1). The 
matrix B made up of these coefficients is termed the matrix of the 
given linear transformation. When specified, the matrix determines 
the linear transformation (1). 

Remark. The ordered n-tiiple of variables X\ . Xn might be 

regarded as a variable point in coordinate space. We can interpret 

... geometrically in exactly the same way. However, it is 

advisable to define a linear transformation of variables as a purely 
arithmetic (or algebraic) concept and not relate it beforehand to 
any geometric conceptions. 
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2 . Given a p X m matrix 


On . 

• ^Im 

Up] . 

• ^pm 


Let us write down the corresponding linear transformation of va¬ 
riables in the form 


= ailf/l + • ■ 

• • + Ulm'/m. 1 

(2) 

= api.Vi + • 

* • “h ^pmUm ^ 



Here, the independent variables are denoted by i/i, ..., (/„„ 
although the relations (2) are at first considered irrespective of 
relations (1). 

At the same time, we might regard the $y_1, \dots, y_m$ in the
right-hand members of (2) as the same quantities defined in (1) with
respect to the variables $x_1, \dots, x_n$, in which case
$z_1, \dots, z_p$ become functions of the independent variables
$x_1, \dots, x_n$, and $y_1, \dots, y_m$ have the role of intermediaries.

If the intermediate variables $y_1, \dots, y_m$ are eliminated from (1)
and (2), then $z_1, \dots, z_p$ will be expressed explicitly in terms of
$x_1, \dots, x_n$. To carry out this operation, replace the $y_j$ in the
right members of (2) by their expressions in (1). Then in each equation
of (2) each of the variables $x_1, \dots, x_n$ will occur m times.
Collecting like terms and denoting the resulting coefficients by
$c_{11}, c_{12}, \dots$, we have

$$\begin{aligned} z_1 &= c_{11}x_1 + \dots + c_{1n}x_n, \\ &\;\;\vdots \\ z_p &= c_{p1}x_1 + \dots + c_{pn}x_n. \end{aligned} \tag{3}$$

We thus get another linear transformation of the variables; the
matrix is

$$C = \begin{pmatrix} c_{11} & \dots & c_{1n} \\ \vdots & & \vdots \\ c_{p1} & \dots & c_{pn} \end{pmatrix}$$

Definition. The linear transformation of variables (3) obtained
by eliminating $y_1, \dots, y_m$ from (2) and (1) is called the product
of the linear transformation of variables (2) by the linear
transformation of variables (1). Here, the $p \times n$ matrix C is
termed the product of the $p \times m$ matrix A by the $m \times n$
matrix B. Symbolically we have

$$C = AB$$


3. Let us now find a formula to express any element of the
matrix C in terms of the elements of the matrices A and B. To do
this we have to actually eliminate the quantities $y_1, \dots, y_m$ from
the relations (1) and (2), and not merely make that statement.
To reduce the amount of computation, we write system (1) more
compactly as

$$y_j = \sum_{k=1}^{n} b_{jk}x_k$$

where $j = 1, 2, \dots, m$ correspond to the numbers of the equations.
System (2) is written down similarly:

$$z_i = \sum_{j=1}^{m} a_{ij}y_j$$

where $i = 1, \dots, p$. We now have

$$z_i = \sum_{j=1}^{m} a_{ij}\left(\sum_{k=1}^{n} b_{jk}x_k\right) = \sum_{k=1}^{n}\left(\sum_{j=1}^{m} a_{ij}b_{jk}\right)x_k \tag{4}$$

On the other hand, we can abbreviate the equations (3) to

$$z_i = \sum_{k=1}^{n} c_{ik}x_k \tag{5}$$

From (4) and (5) we get

$$c_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} \tag{6}$$

or

$$c_{ik} = a_{i1}b_{1k} + a_{i2}b_{2k} + \dots + a_{im}b_{mk} \tag{7}$$


Expression (7) is called the product of the ith row of matrix A
by the kth column of matrix B (by analogy with the familiar
analytic-geometry formula expressing the scalar product of vectors
in terms of their components).

The number of columns of matrix A must be equal to the number
of rows of matrix B, otherwise the product AB is not defined. The
number of rows and columns in the product may be expressed
according to the following scheme:

$$(p \times m) \cdot (m \times n) = (p \times n)$$

The two products AB and BA are simultaneously defined and of the
same size if and only if A and B are square matrices of the same order.

Remark 1. Of course matrix products could have been defined
directly with the aid of formula (7) and without taking into
account its origination from linear transformations.
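The two routes to the product can be compared numerically. The following is a minimal sketch, not part of the original text: the helper names `matmul` and `apply` and the sample matrices are illustrative assumptions. It applies relations (1) and (2) one after the other and checks that the result agrees with a single transformation whose matrix is built by formula (7).

```python
def matmul(A, B):
    """Product C = AB by formula (7): c_ik = sum over j of a_ij * b_jk."""
    p, m, n = len(A), len(B), len(B[0])
    if any(len(row) != m for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(n)]
            for i in range(p)]


def apply(M, v):
    """Apply the linear transformation of variables with matrix M to the tuple v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


# Illustrative data (assumed): B is 3 x 2 (m = 3, n = 2), A is 2 x 3 (p = 2).
B = [[1, 2], [0, 1], [3, -1]]
A = [[2, 0, 1], [1, -1, 4]]
x = [5, 7]

y = apply(B, x)             # relations (1): y = Bx
z_two_steps = apply(A, y)   # relations (2): z = Ay
C = matmul(A, B)            # matrix of the product transformation (3)
z_direct = apply(C, x)      # z = Cx

print("C =", C)
print("two steps:", z_two_steps, "direct:", z_direct)
```

Both routes give the same values of $z_1, z_2$, which is what the definition of the product requires.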

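Remark 2. Generally speaking, matrix multiplication is not
commutative, as is readily seen from some examples. Let

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

Then

$$AB = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad BA = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$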
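The example of Remark 2 is easy to reproduce mechanically. This is a small sketch only; the helper `matmul` is an illustrative name, and it assumes the sizes of its arguments are compatible.

```python
def matmul(A, B):
    """Row-by-column product, formula (7); sizes are assumed compatible."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]


A = [[1, 0], [0, 0]]
B = [[0, 1], [0, 0]]

AB = matmul(A, B)
BA = matmul(B, A)

print("AB =", AB)
print("BA =", BA)
print("AB == BA:", AB == BA)
```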


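4. Let us write the sets of variables $x_k$, $y_j$, $z_i$ in the form of
column matrices:

$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad Y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}, \quad Z = \begin{pmatrix} z_1 \\ \vdots \\ z_p \end{pmatrix}$$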


Then the formulas for transforming variables (1), (2) and (3) 
may be written in the form of matrix equations: 

Y = BX, Z = AY, Z = CX 


where C = AB. 

5. The following are a number of identities expressing the
properties of matrix multiplication:

(1) $A(BC) = (AB)C$ (associativity). By this property, the
product of three matrices ABC can be written without parentheses.

(2) $(\alpha A)B = A(\alpha B) = \alpha AB$;

(3) $A(B + C) = AB + AC$; $(B + C)A = BA + CA$.

Here, $\alpha$ is an arbitrary scalar, and A, B, C are arbitrary
matrices whose numbers of rows and columns permit the foregoing
operations to be performed.

The proof of identities (1), (2), and (3) is elementary and we
do not give it here.

6. There is one more matrix operation that will be used in the
sequel. It is called transposition or taking the transpose of a
matrix. We denote the transpose by $A^*$; it is the matrix obtained
from matrix A by replacing the rows by the corresponding (as to
number) columns. The following obvious identities occur:

(4) $(A + B)^* = A^* + B^*$;

(5) $(\alpha A)^* = \alpha A^*$.

And also

(6) $(AB)^* = B^*A^*$.

The validity of the last identity is quite evident if we take into
account the fact that the product of A by B is constructed as
follows: row of A into column of B.

§ 3. Square matrices and nonsingular transformations

1. In this section we will discuss in more detail the linear
transformations under which the number of variables is preserved.
These transformations are associated with square matrices.


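2. It will be recalled that the determinant of an n by n matrix
$A = \|a_{ij}\|$ is the quantity

$$\det A = \sum_{i_1, \dots, i_n} \varepsilon_{i_1 i_2 \dots i_n}\, a_{i_1 1} a_{i_2 2} \dots a_{i_n n} \tag{1}$$

where

$$\varepsilon_{i_1 i_2 \dots i_n} = \begin{cases} 0 & \text{if there are identical numbers among the numbers } i_1, i_2, \dots, i_n; \\ +1 & \text{if the permutation } (i_1, i_2, \dots, i_n) \text{ is even;} \\ -1 & \text{if the permutation } (i_1, i_2, \dots, i_n) \text{ is odd.} \end{cases}$$

The indices $i_1, i_2, \dots, i_n$ take on the values 1, 2, ..., n.

In (1), the second indices of the elements of matrix A are taken
in their natural order. For any other arrangement of them and
also in the case of repetitions, we have

$$\sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n}\, a_{i_1 j_1} \dots a_{i_n j_n} = \varepsilon_{j_1 \dots j_n} \det A \tag{2}$$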


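We will now prove a theorem that will come in handy later on.
Theorem. The determinant of a product of matrices is equal to
the product of their determinants:

$$\det AB = \det A \cdot \det B$$

Proof. Using (1), (2) and formula (6) of the preceding section
we get

$$\begin{aligned}
\det AB = \det C &= \sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n}\, c_{i_1 1} \dots c_{i_n n} \\
&= \sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n} \Bigl(\sum_{j_1} a_{i_1 j_1} b_{j_1 1}\Bigr) \dots \Bigl(\sum_{j_n} a_{i_n j_n} b_{j_n n}\Bigr) \\
&= \sum_{j_1, \dots, j_n} \Bigl(\sum_{i_1, \dots, i_n} \varepsilon_{i_1 \dots i_n}\, a_{i_1 j_1} \dots a_{i_n j_n}\Bigr) b_{j_1 1} \dots b_{j_n n} \\
&= \sum_{j_1, \dots, j_n} (\det A)\, \varepsilon_{j_1 \dots j_n}\, b_{j_1 1} \dots b_{j_n n} = \det A \cdot \det B
\end{aligned}$$

which is what we set out to prove.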
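The theorem can be spot-checked on a numerical example. The sketch below is an illustration under stated assumptions: the function names and the two sample matrices are not from the text, and the determinant is computed directly from definition (1) as a sum over permutations, which is practical only for small n.

```python
from itertools import permutations


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
                     if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def det(A):
    """Determinant by formula (1): second indices run in natural order."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]   # factor a_{i_col, col}
        total += term
    return total


def matmul(A, B):
    """Row-by-column product of two square matrices of the same order."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


# Illustrative matrices (assumed for the example).
A = [[2, 1, 0], [1, 3, -1], [0, 5, 4]]
B = [[1, 0, 2], [-1, 1, 0], [3, 2, 1]]

print(det(A), det(B), det(matmul(A, B)))
```

A single example is of course no proof; it only shows the identity at work for one pair of matrices.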


3. A square matrix is said to be nonsingular (that is, invertible)
if its determinant is not equal to zero. The linear transformation
of variables $x_1, \dots, x_n$ into the variables $y_1, \dots, y_n$ is
said to be nonsingular if it has a nonsingular square matrix.

By Subsection 2, the product of nonsingular matrices is a
nonsingular matrix. The product of nonsingular linear transformations
of variables is a nonsingular linear transformation of variables.

A nonsingular $n \times n$ matrix A has rank $r = n$. This is clear
since the determinant of such a matrix is the basis minor of the
matrix. A singular matrix has rank $r < n$. This is also clear since
the determinant of a singular matrix is equal to zero and, hence,
the basis minors of the matrix have order less than n (or they are
absent, but then $r = 0$). It is the reduction in the rank of the
matrix (compared with the standard case of $r = n$) that makes
for singularity.

4. Suppose we have a nonsingular linear transformation of
variables with matrix $A = \|a_{ij}\|$:

$$\begin{aligned} y_1 &= a_{11}x_1 + \dots + a_{1n}x_n, \\ &\;\;\vdots \\ y_n &= a_{n1}x_1 + \dots + a_{nn}x_n. \end{aligned} \tag{3}$$

We introduce the notation

$$\Delta = \det A$$

By hypothesis, $\Delta \neq 0$. In that case, system (3) with arbitrary
given $y_1, \dots, y_n$ and unknown $x_1, \dots, x_n$ has a unique
solution. Let us find it. It is convenient to use the Cramer formulas.
By these formulas, $x_k$ is expressed in the form of a fraction whose
denominator is $\Delta$ and whose numerator is obtained by replacing
the kth column of $\Delta$ by a column made up of the quantities
$y_1, \dots, y_n$. Expanding the determinant in the numerator of such a
fraction, we get

$$x_k = \frac{1}{\Delta}\,(A_{1k}y_1 + \dots + A_{nk}y_n)$$

where $A_{1k}, \dots, A_{nk}$ denote the cofactors of the elements of the
kth column of the determinant of matrix A. Thus

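$$\begin{aligned} x_1 &= \frac{A_{11}}{\Delta}\,y_1 + \dots + \frac{A_{n1}}{\Delta}\,y_n, \\ &\;\;\vdots \\ x_n &= \frac{A_{1n}}{\Delta}\,y_1 + \dots + \frac{A_{nn}}{\Delta}\,y_n. \end{aligned} \tag{4}$$

The equations (4), which express $x_1, \dots, x_n$ in terms of
$y_1, \dots, y_n$ from formula (3), constitute a linear transformation of
variables; it is the inverse of the linear transformation (3), where
$y_1, \dots, y_n$ are expressed in terms of $x_1, \dots, x_n$.

The matrix of transformation (4) is said to be the inverse of
the (nonsingular) matrix A of the transformation (3). It is denoted
by $A^{-1}$. Thus, if

$$A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix}, \quad \Delta = \det A \neq 0,$$

then

$$A^{-1} = \begin{pmatrix} \dfrac{A_{11}}{\Delta} & \dots & \dfrac{A_{n1}}{\Delta} \\ \vdots & & \vdots \\ \dfrac{A_{1n}}{\Delta} & \dots & \dfrac{A_{nn}}{\Delta} \end{pmatrix}$$

Note that in $A^{-1}$ we find, located in the rows, the cofactors of
those elements of A which in A are located in the corresponding
(as to number label) columns.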
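The cofactor description of the inverse translates directly into a short routine. The following is a sketch under stated assumptions: the function names are illustrative, the matrix entries are taken to be integers so that exact fractions can be used, and the recursive determinant is meant for small matrices only.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix left after deleting row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1   # empty determinant, so that det([[a]]) == a
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def inverse(A):
    """Inverse of a nonsingular integer matrix: entry (k, i) is A_ik / Delta."""
    n, delta = len(A), det(A)
    if delta == 0:
        raise ValueError("a singular matrix has no inverse")

    def cofactor(i, k):
        return (-1) ** (i + k) * det(minor(A, i, k))

    # Row k of the inverse holds the cofactors of column k of A.
    return [[Fraction(cofactor(i, k), delta) for i in range(n)] for k in range(n)]


def matmul(A, B):
    """Row-by-column product of two square matrices of the same order."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


A = [[2, 1, 0], [1, 3, -1], [0, 5, 4]]   # illustrative matrix, det A = 30
A_inv = inverse(A)
identity = [[int(i == k) for k in range(3)] for i in range(3)]

print(A_inv[0])
print(matmul(A, A_inv) == identity, matmul(A_inv, A) == identity)
```

Note the transposition inside `inverse`: the cofactor of the element in row i and column k of A lands in row k and column i of the inverse.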


5. Since equations (4) express the solution of system (3) with
respect to the unknowns $x_1, \dots, x_n$, substitution of the
expressions (4) into the right-hand members of (3) should give the
following result:

$$\begin{aligned} y_1 &= y_1, \\ y_2 &= y_2, \\ &\;\;\vdots \\ y_n &= y_n. \end{aligned} \tag{5}$$

This is called an identity transformation. Its matrix is denoted
by the letter I or E and is called the unit matrix (or identity
matrix):

$$E = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}$$

Since the product of the transformation (3) by the inverse
transformation (4) is an identity transformation, (5), correspondingly
the product of the matrix A by the inverse matrix $A^{-1}$ is a unit
matrix, E:

$$AA^{-1} = E \tag{6}$$


6. By Subsection 2, we have from (6)

$$(\det A) \cdot (\det A^{-1}) = 1$$

whence $\det A^{-1} \neq 0$, and the matrix $A^{-1}$ is nonsingular. Then
the solution of system (4) for $y_1, \dots, y_n$ is unique and, hence,
coincides with the expressions (3). For this reason, substitution of (3)
into the right members of (4) must yield the identity transformation

$$x_1 = x_1, \quad x_2 = x_2, \quad \dots, \quad x_n = x_n$$

At the same time we get

$$A^{-1}A = E$$

7. To summarize, every nonsingular linear transformation of
the variables $x_1, \dots, x_n$ into $y_1, \dots, y_n$ has a unique inverse
linear transformation of the variables $y_1, \dots, y_n$ into
$x_1, \dots, x_n$. It too is nonsingular and its inverse is the original
transformation. Every nonsingular matrix has a unique inverse, which is
also nonsingular and the inverse of which is the original matrix.

8. The notions of an inverse transformation and an inverse
matrix may be defined in a somewhat different manner. Namely,
suppose we have two linear transformations of variables, which
we write compactly as

$$y_i = \sum_{j=1}^{n} a_{ij}x_j \tag{7}$$

with the $n \times n$ matrix $A = \|a_{ij}\|$ and

$$x_j = \sum_{k=1}^{n} b_{jk}z_k \tag{8}$$

with the $n \times n$ matrix $B = \|b_{jk}\|$. We can then speak of the
transformation (8) as being the inverse of (7) if the product of (7) by
(8) is the identity transformation

$$y_i = z_i \quad (i = 1, \dots, n) \tag{9}$$

Correspondingly, the matrix B may be called the inverse of A if

$$AB = E \tag{10}$$

The nonsingularity of the transformation (7) and the matrix A
need not be specially stipulated, for it unavoidably follows from
(10), because, due to (10), $(\det A) \cdot (\det B) = 1$, and therefore
$\det A \neq 0$. But provided $\det A \neq 0$, the system (7) with unknowns
$x_1, \dots, x_n$ and knowns $y_1, \dots, y_n$ has a unique solution. Hence,
the expressions (8), with account taken of (9), that is, with
account of the fact that $z_k = y_k$, must coincide with the expressions
(4). This brings us back to our original definition of an inverse
transformation and an inverse matrix.

9. If A and B are nonsingular, then we have the identity

$$(AB)^{-1} = B^{-1}A^{-1}$$

Proof. Let $B^{-1}A^{-1} = C$, then

$$(AB)C = (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AEA^{-1} = (AE)A^{-1} = AA^{-1} = E$$

Hence, $C = (AB)^{-1}$ by Subsection 8, (10).
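The reversal of the order of the factors is easy to observe on $2 \times 2$ matrices. The sketch below is illustrative only: `inverse2` and `matmul` are assumed helper names, the closed-form $2 \times 2$ inverse is the cofactor formula of Subsection 4 written out for $n = 2$, and the sample matrices are arbitrary.

```python
from fractions import Fraction


def inverse2(M):
    """Inverse of a nonsingular 2 x 2 integer matrix by the cofactor formula."""
    (a, b), (c, d) = M
    delta = a * d - b * c
    if delta == 0:
        raise ValueError("a singular matrix has no inverse")
    return [[Fraction(d, delta), Fraction(-b, delta)],
            [Fraction(-c, delta), Fraction(a, delta)]]


def matmul(A, B):
    """Row-by-column product of two 2 x 2 matrices."""
    return [[sum(A[i][j] * B[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]


A = [[1, 2], [3, 4]]   # det A = -2
B = [[0, 1], [1, 1]]   # det B = -1

lhs = inverse2(matmul(A, B))                   # (AB)^{-1}
rhs = matmul(inverse2(B), inverse2(A))         # B^{-1} A^{-1}
wrong_order = matmul(inverse2(A), inverse2(B)) # A^{-1} B^{-1}

print(lhs == rhs)          # the identity of Subsection 9
print(lhs == wrong_order)  # the factors may not be interchanged here
```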

10. In the preceding subsection we took advantage of the fact
that the following readily verifiable identity holds true for any
$n \times n$ matrix A: $AE = A$ (and also $EA = A$), where E is the unit
matrix. We can say that the matrix E plays the same role in matrix
multiplication as does the number unity in the multiplication of
ordinary numbers.

Remark. It is easy to prove that there is no other matrix with
that property. What is more, if for any one nonsingular matrix A
the equation $AE_1 = A$ holds, then $E_1 = E$. Indeed,

$$E_1 = EE_1 = (A^{-1}A)E_1 = A^{-1}(AE_1) = A^{-1}A = E$$

11. Together with the multiplication of matrices are defined the
natural powers of a matrix: $A^2 = AA$, $A^3 = AAA$, and so on. Thus
is defined the notion of a polynomial of a matrix:

$$P(A) = a_0A^n + a_1A^{n-1} + \dots + a_{n-1}A + a_nE \tag{11}$$

where $a_0, \dots, a_n$ are scalars and $P(A)$ is a square matrix of the
same order as A.

It is a general agreement that any $n \times n$ matrix A to the power
of zero is equal to the unit $n \times n$ matrix E:

$$A^0 = E$$

Therefore, the term $a_nE$ in (11) plays the part of a constant term.

§ 4. The rank of a product of matrices 

1. Given an $m \times n$ matrix $A = \|a_{ij}\|$, an $n \times p$ matrix
$B = \|b_{ij}\|$ and an $m \times p$ matrix $C = \|c_{ij}\|$ equal to their
product: $C = AB$. The following theorem holds.

Theorem 1. The rank of a product of matrices does not exceed
the rank of any one of the factors, that is,

$$\operatorname{rank} AB \leq \operatorname{rank} A, \tag{1}$$

$$\operatorname{rank} AB \leq \operatorname{rank} B \tag{2}$$

Proof. We establish inequality (2), but first let us note two
trivial cases.

(1) If rank B = 0, then matrix B is a zero matrix. In that case
$C = AB$ is a zero matrix too, rank C = 0 and (2) holds.

(2) If rank B = p (where p is the number of columns in
matrix B), then (2) also holds, since rank C does not exceed the
number of columns p in the matrix C.

Suppose rank B = r and assume that $0 < r < p$.

On this assumption, matrix B has a certain system of r basis
columns and at least one more column not belonging to that
system. For the sake of definiteness, suppose the first r columns
of B are basis columns. We consider any column k, $k > r$. By the
basis-minor lemma, it is expressed linearly in terms of the basis
columns:


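$$\begin{pmatrix} b_{1k} \\ \vdots \\ b_{nk} \end{pmatrix} = \alpha_1 \begin{pmatrix} b_{11} \\ \vdots \\ b_{n1} \end{pmatrix} + \dots + \alpha_r \begin{pmatrix} b_{1r} \\ \vdots \\ b_{nr} \end{pmatrix}$$

In abbreviated notation, this expression becomes

$$b_{jk} = \sum_{s=1}^{r} \alpha_s b_{js} \tag{3}$$

where $j = 1, 2, \dots, n$ and the index k is fixed.

On the other hand, by the definition of a product of matrices we
have

$$c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk} \tag{4}$$

From (3) and (4)

$$c_{ik} = \sum_{j=1}^{n} a_{ij}\left(\sum_{s=1}^{r} \alpha_s b_{js}\right) = \sum_{s=1}^{r} \alpha_s \sum_{j=1}^{n} a_{ij}b_{js} = \sum_{s=1}^{r} \alpha_s c_{is} \tag{5}$$

where $1 \leq i \leq m$, the index k is fixed and has the same numerical
value as in the formula (3). Equations (5) show that in the
matrix C any column with $k > r$ can be linearly expressed in
terms of the first r columns:

$$\begin{pmatrix} c_{1k} \\ \vdots \\ c_{mk} \end{pmatrix} = \alpha_1 \begin{pmatrix} c_{11} \\ \vdots \\ c_{m1} \end{pmatrix} + \dots + \alpha_r \begin{pmatrix} c_{1r} \\ \vdots \\ c_{mr} \end{pmatrix}$$

whence and by virtue of Lemma 3 of Section 6, Chapter I, the
rank of the system of all columns of matrix C does not exceed
r, that is, rank $C \leq r$, and the proof of inequality (2) is complete.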

In order to prove inequality (1) it is now sufficient to pass to
the transposes. Indeed, by Property 6 of Subsection 6, Section 2,

$$C^* = (AB)^* = B^*A^* \tag{6}$$

From (6) and due to the established inequality for the second
factor we have

$$\operatorname{rank} C = \operatorname{rank} C^* \leq \operatorname{rank} A^* = \operatorname{rank} A$$

This completes the proof of the theorem.

2. To avoid any misunderstanding, it will be well to make the
following warning with respect to the proof of inequality (2). As
has been demonstrated, any column number k in matrix C, $k > r$,
can be linearly expressed in terms of the first r columns. We do
not know whether these columns are independent or not. Therefore
we cannot assert that rank C = r. Simple numerical examples
will make it clear that such an assertion is not only unfounded
but actually erroneous.
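One such numerical example is sketched below. It is an illustration, not taken from the text: the helper names `rank` and `matmul` and the two matrices are assumptions, and the rank is computed by Gaussian elimination over exact fractions rather than through basis minors. Here B has rank r = 1 with its first column as the basis column, yet the product C = AB is the zero matrix.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(A, B):
    """Row-by-column product; sizes are assumed compatible."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]


A = [[1, 0], [0, 0]]   # rank 1
B = [[0, 0], [1, 0]]   # rank 1; the first column is the basis column
C = matmul(A, B)

print(rank(A), rank(B), rank(C))
```

The first column of C is zero, so although every column of C is a multiple of it, the rank of C falls below r.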

3. Given: an arbitrary $m \times n$ matrix A, a nonsingular square
$n \times n$ matrix B and a nonsingular square $m \times m$ matrix $B'$.

The following theorem holds.

Theorem 2. The ranks of the products AB and B'A are equal to
the rank of matrix A, that is,

$$\operatorname{rank} AB = \operatorname{rank} A \quad \text{if } \det B \neq 0 \tag{7}$$

and

$$\operatorname{rank} B'A = \operatorname{rank} A \quad \text{if } \det B' \neq 0 \tag{8}$$

In other words, the rank is preserved under multiplication (on
the left or on the right) of the given matrix by a nonsingular
matrix.

Proof. Let $C = AB$. Then rank $C \leq$ rank A by Theorem 1. But
due to the nonsingularity of B we have $A = ABB^{-1} = CB^{-1}$ so
that rank $A \leq$ rank C by Theorem 1. This completes the proof of
(7). Equation (8) is proved in similar fashion.
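Theorem 2 can be illustrated in the same spirit. In the sketch below the helpers and all matrices are illustrative assumptions; a singular multiplier S is included for contrast, since for it only the inequality of Theorem 1 is guaranteed.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(A, B):
    """Row-by-column product; sizes are assumed compatible."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2, 3], [0, 1, 1]]               # 2 x 3, rank 2
B = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]    # 3 x 3, det B = 1
B_prime = [[2, 1], [1, 1]]               # 2 x 2, det B' = 1
S = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]    # 3 x 3, singular

print(rank(A), rank(matmul(A, B)), rank(matmul(B_prime, A)))
print(rank(matmul(A, S)))   # with a singular factor the rank may drop
```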


§ 5. Transformation of coordinates in a change of basis 

1. In many cases one finds it necessary to make a change of
basis. We will now derive formulas according to which the
coordinates of an arbitrary vector are transformed when passing to a
new basis.

2. Let $L_n$ be an n-dimensional linear space and let $e_1, \dots, e_n$
be its basis. Let $e'_1, \dots, e'_n$ be the new basis in $L_n$. We expand
each of the vectors $e'_1, \dots, e'_n$ relative to the old basis to get

$$\begin{aligned} e'_1 &= P_{11}e_1 + P_{12}e_2 + \dots + P_{1n}e_n, \\ e'_2 &= P_{21}e_1 + P_{22}e_2 + \dots + P_{2n}e_n, \\ &\;\;\vdots \\ e'_n &= P_{n1}e_1 + P_{n2}e_2 + \dots + P_{nn}e_n. \end{aligned} \tag{I}$$

The coefficients of these expansions constitute the square $n \times n$
matrix

$$P = \begin{pmatrix} P_{11} & \dots & P_{1n} \\ \vdots & & \vdots \\ P_{n1} & \dots & P_{nn} \end{pmatrix}$$

3. The ith row of matrix P is formed by coordinates of the vector
$e'_i$ in the original basis. Since the new basis vectors
$e'_1, \dots, e'_n$ are independent, the matrix P is nonsingular, that is,

$$\det P \neq 0 \tag{1}$$

On the other hand, if we arbitrarily take matrix P under
condition (1), then the vectors $e'_1, \dots, e'_n$ defined by the equations
(I) will be independent and, hence, will constitute a basis. Thus,
the transition to any new basis is determined by specification of
the matrix P under the sole condition that it be nonsingular.

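4. Let x be an arbitrary vector in $L_n$. Let us expand it in terms
of the old basis and the new basis:

$$x = \sum_{j} x_j e_j, \qquad x = \sum_{i} x'_i e'_i$$

We can write the formulas (I) compactly as

$$e'_i = \sum_{j=1}^{n} P_{ij}e_j, \quad i = 1, \dots, n \tag{I}$$

whence and from the second expansion of x we have

$$x = \sum_{i} x'_i \Bigl(\sum_{j} P_{ij}e_j\Bigr) = \sum_{j} \Bigl(\sum_{i} P_{ij}x'_i\Bigr) e_j$$

Comparing this expansion with the first, we find

$$x_j = \sum_{i=1}^{n} P_{ij}x'_i, \quad j = 1, \dots, n \tag{II}$$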


or, written out in full,

$$\begin{aligned} x_1 &= P_{11}x'_1 + P_{21}x'_2 + \dots + P_{n1}x'_n, \\ x_2 &= P_{12}x'_1 + P_{22}x'_2 + \dots + P_{n2}x'_n, \\ &\;\;\vdots \\ x_n &= P_{1n}x'_1 + P_{2n}x'_2 + \dots + P_{nn}x'_n. \end{aligned}$$

Formulas (II) express the old coordinates $x_1, \dots, x_n$ of the
vector x in terms of its new coordinates $x'_1, \dots, x'_n$ and constitute
a linear transformation of the variables $x'_1, \dots, x'_n$ into the
variables $x_1, \dots, x_n$. The matrix of this transformation is $P^*$,
which is the transpose of matrix P. Therefore, denoting by X and X' the
column matrices consisting of the old and new coordinates of the
vector x, we can write formulas (II) as a matrix equation:

$$X = P^*X' \tag{IIa}$$

The transformation (II) is nonsingular since

$$\det P^* = \det P \neq 0$$

Thus, any change in the basis is associated with a nonsingular
linear transformation of the coordinates of every vector.

5. The new coordinates of a vector are expressed in terms of the
old coordinates by means of a linear transformation that is
inverse to the transformation (II).

We will write it in the form

$$\begin{aligned} x'_1 &= Q_{11}x_1 + Q_{12}x_2 + \dots + Q_{1n}x_n, \\ x'_2 &= Q_{21}x_1 + Q_{22}x_2 + \dots + Q_{2n}x_n, \\ &\;\;\vdots \\ x'_n &= Q_{n1}x_1 + Q_{n2}x_2 + \dots + Q_{nn}x_n \end{aligned}$$

or, abridged,

$$x'_i = \sum_{j=1}^{n} Q_{ij}x_j, \quad i = 1, \dots, n \tag{III}$$

The matrix of the transformation (III), which we denote by Q, is the
inverse of $P^*$:

$$Q = \begin{pmatrix} Q_{11} & \dots & Q_{1n} \\ \vdots & & \vdots \\ Q_{n1} & \dots & Q_{nn} \end{pmatrix} = (P^*)^{-1} \tag{2}$$

Note that each one of the elements of Q is equal to the cofactor
of the element of matrix P having the same row and column
number labels divided by det P. As in (IIa), we can write

$$X' = QX \tag{IIIa}$$

6. The equation (2) relating matrices P and Q may be rewritten
in two other equivalent modes:

$$P^*Q = E, \qquad QP^* = E \tag{3}$$

It will be useful to rewrite equations (3) in a form that makes
use of the so-called Kronecker delta $\delta_{ij}$ to denote elements of
the unit matrix E. By definition,

$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j \end{cases}$$

In this way, the two matrix equations (3) are replaced,
accordingly, by two systems of numerical equations:

$$\sum_{k} P_{ki}Q_{kj} = \delta_{ij}, \qquad \sum_{j} Q_{kj}P_{lj} = \delta_{kl} \tag{4}$$

In the former, the ith row of $P^*$ is multiplied by the jth column
of Q, in the latter, the kth row of Q is multiplied by the lth
column of the matrix $P^*$.
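The relations X = P*X' and X' = QX with Q = (P*)^{-1} can be traced on a two-dimensional example. The sketch below is illustrative: the new basis $e'_1 = e_1 + e_2$, $e'_2 = -e_1 + 2e_2$, the helper names and the sample coordinates are all assumptions made for the example.

```python
from fractions import Fraction


def transpose(M):
    """Transpose: rows become columns."""
    return [list(col) for col in zip(*M)]


def apply(M, v):
    """Product of the matrix M by the column of numbers v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def inverse2(M):
    """Inverse of a nonsingular 2 x 2 integer matrix."""
    (a, b), (c, d) = M
    delta = a * d - b * c
    if delta == 0:
        raise ValueError("a singular matrix has no inverse")
    return [[Fraction(d, delta), Fraction(-b, delta)],
            [Fraction(-c, delta), Fraction(a, delta)]]


# Row i of P holds the old coordinates of the new basis vector e'_i.
P = [[1, 1], [-1, 2]]          # det P = 3, so P is nonsingular
P_star = transpose(P)
Q = inverse2(P_star)

X_new = [3, 1]                 # new coordinates x'_1, x'_2
X_old = apply(P_star, X_new)   # old coordinates from X = P* X'
X_back = apply(Q, X_old)       # new coordinates again from X' = Q X

print(X_old, X_back)
print(Q)
```

Here $x = 3e'_1 + e'_2 = 2e_1 + 5e_2$, and the elements of Q are the cofactors of the corresponding elements of P divided by det P = 3.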

7. In conclusion note that any nonsingular linear transformation
of the variables $x_1, \dots, x_n$ into the variables $x'_1, \dots, x'_n$ may
be regarded as a transformation of the coordinates of vectors in
an n-dimensional linear space. Indeed, if we are given the
transformation (III), then we know the matrix Q ($\det Q \neq 0$). From
this we find $P^* = Q^{-1}$ and $P = (P^*)^*$. If we know the matrix P,
we obtain the corresponding basis $e'_1, \dots, e'_n$ from the
formulas (I).

§ 1. Affine space

1. In linear space, the elements are vectors regarded as entities 
involved in linear operations. However, in many problems atten¬ 
tion is focussed on geometric facts associated with the mutual 
positions of figures (subsets) in the space at hand, with the linear 
operations of secondary interest. 

It is for this reason that along with a linear space we intro¬ 
duce the concept of an affine space whose elements are points. The 
points of affine space are related in a definite way to the vectors 
of linear space (in much the way as is done in elementary analytic 
geometry). The conditions for these relationships are given in the 
next subsection together with a definition of an affine space. 

2. Suppose we have a certain set $\mathfrak{A}$ whose elements will be
called points denoted by capital letters A, B, ..., M, .... Also
given is a certain linear space L. Now let every ordered pair of
points in $\mathfrak{A}$ be associated with a vector in L. If the pair of
points A, B is associated with the vector x, we write
$x = \overrightarrow{AB}$. Here, the symbol $\overrightarrow{AB}$ is merely a
different notation for the vector x.

The first of the two points is the origin (tail) of the vector
$\overrightarrow{AB}$, the second point is the terminus (tip).

Definition. A set $\mathfrak{A}$ associated with a linear space L is said to
be an affine space if the following two axioms hold true.

(1) For every point A in $\mathfrak{A}$ and for every vector x in L there is
a unique point B in $\mathfrak{A}$ such that $\overrightarrow{AB} = x$.

(2) If $\overrightarrow{AB} = x$, $\overrightarrow{BC} = y$, then
$\overrightarrow{AC} = x + y$ (Fig. 5).

An affine space is said to be real or complex, finite-dimensional
or infinite-dimensional, if the corresponding linear space L is
respectively real or complex, finite-dimensional or infinite-dimensional.
The dimension of an affine space $\mathfrak{A}$ is the number equal to the
dimension of the linear space L.

Remark 1. Every linear space L may be regarded as an affine
space $\mathfrak{A}$. It suffices merely to call the vectors points and to
associate the vector $b - a \in L$ with each pair of vectors a, b
considered as points of the set $\mathfrak{A}$.

Remark 2. Every affine space $\mathfrak{A}$ may be regarded as a linear
space. It suffices merely to specify some point O in the space
$\mathfrak{A}$. Then with any arbitrary point $M \in \mathfrak{A}$ is associated its
radius vector $\overrightarrow{OM}$. The set of radius vectors of all points
of the space $\mathfrak{A}$ is what constitutes the space L.

Remark 3. In the future, we will always indicate which space,
affine or linear, we are dealing with. However, it is possible to
agree to consider an affine space with an indicated point in it.
Then we will have at our disposal both points and vectors.

3. Note the following two elementary properties of an affine
space.

Theorem 1. Associated with every pair of coincident points in
$\mathfrak{A}$ is the zero vector of L.

Proof. Assume that A is an arbitrary point and that a vector z
is associated with the pair of points A, A. Let x be an arbitrary
vector in L. Then by the first axiom of an affine space there is a
point B such that $\overrightarrow{AB} = x$. Applying the second axiom, we get

$$x + z = z + x = \overrightarrow{AA} + \overrightarrow{AB} = \overrightarrow{AB} = x$$

hence, $z = \theta$, the zero vector.

Theorem 2. If $\overrightarrow{AB} = x$, then $\overrightarrow{BA} = -x$.

Proof. Let $\overrightarrow{BA} = y$. Then

$$x + y = \overrightarrow{AB} + \overrightarrow{BA} = \overrightarrow{AA} = \theta$$

whence $y = -x$.

§ 2. Affine coordinates 

1. We will assume that the given affine space $\mathfrak{A}$ is
n-dimensional, and will introduce a so-called affine system of
coordinates.

To do this, in $\mathfrak{A}$ we choose an arbitrary point O, called the
origin, and in the appropriate linear space L we take a basis
$e_1, \dots, e_n$. Let M be an arbitrary point in $\mathfrak{A}$. Together with
the coordinate origin, it defines a vector $\overrightarrow{OM} \in L$ called the
radius vector of the point M. Expanding the radius vector
$\overrightarrow{OM}$ in terms of the basis $e_1, \dots, e_n$, we get

$$\overrightarrow{OM} = x_1e_1 + x_2e_2 + \dots + x_ne_n$$

The coefficients of this expansion, $x_1, \dots, x_n$, are called the
affine coordinates of the point M (referred to the chosen system with
origin O and basis $e_1, \dots, e_n$). Note that an affine system of
coordinates is given by two unlike entities: the point O in affine
space and the basis $e_1, \dots, e_n$ in linear space.

The coordinates of every point M are defined uniquely due to
the uniqueness of the expansion of the vector $\overrightarrow{OM}$ in terms of
the basis $e_1, \dots, e_n$.

2. Let there be given another arbitrary point N with coordinates
$y_1, \dots, y_n$. We will show how the coordinates of the vector
$\overrightarrow{MN}$ are expressed in terms of the affine coordinates of the
points M and N. Taking advantage of Axiom 2 and Theorem 2 of
Section 1, we find

$$\overrightarrow{MN} = \overrightarrow{MO} + \overrightarrow{ON} = \overrightarrow{ON} - \overrightarrow{OM} = (y_1 - x_1)e_1 + \dots + (y_n - x_n)e_n$$

so that the vector $\overrightarrow{MN}$ has the coordinates
$y_1 - x_1, \dots, y_n - x_n$.
In other words, to obtain the coordinates of the vector
$\overrightarrow{MN}$, subtract the coordinates of the origin from the
coordinates of the terminus of the vector.

3. Retaining the chosen basis, we translate the coordinate origin
from O to $O_1$. We denote by $a_1, \dots, a_n$ the coordinates of $O_1$ in
the original system and will assume them to be known. We then
find the new affine coordinates $\tilde{x}_1, \dots, \tilde{x}_n$ of the arbitrary
point M. (We denote the old coordinates of the point M by
$x_1, \dots, x_n$.) We have the vector equation

$$\overrightarrow{OM} = \overrightarrow{OO_1} + \overrightarrow{O_1M}$$

or, what is the same thing,

$$x_1e_1 + \dots + x_ne_n = a_1e_1 + \dots + a_ne_n + \tilde{x}_1e_1 + \dots + \tilde{x}_ne_n$$

whence, due to the uniqueness of the vector expansion in terms of
the basis, we find

$$x_i = \tilde{x}_i + a_i, \quad i = 1, \dots, n$$

4. If the coordinate origin remains fixed and the basis undergoes
change, then the affine coordinates of the points are transformed
in the same way as the coordinates of their radius vectors, that is,
by the formulas of Section 5, Chapter II.

5. Now suppose we pass from the given affine coordinate system
with origin O and basis $e_1, \dots, e_n$ to a new system with origin O'
and basis $e'_1, \dots, e'_n$. Here, we assume as known the coordinates
of O' in the old system ($a_1, \dots, a_n$) and also the vector
expansions of the new basis in terms of the old basis:

$$e'_i = \sum_{j=1}^{n} P_{ij}e_j$$

Using the results of the two preceding subsections and of
Section 5, Chapter II, we get formulas that express the old
coordinates $x_1, \dots, x_n$ of an arbitrary point M in terms of its new
coordinates $x'_1, \dots, x'_n$:

$$x_j = \sum_{i=1}^{n} P_{ij}x'_i + a_j, \quad j = 1, \dots, n \tag{I}$$

Besides formulas (I), we have the inverse formulas

$$x'_i = \sum_{j=1}^{n} Q_{ij}(x_j - a_j) = \sum_{j=1}^{n} Q_{ij}x_j + a'_i$$

where $Q = (P^*)^{-1}$ (see Section 5, Chapter II),
$a'_i = -\sum_{j=1}^{n} Q_{ij}a_j$, $i = 1, \dots, n$.

Transformations of affine coordinates will be used frequently in
the sequel.
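The passage between two affine coordinate systems can be followed on a plane example. This is a sketch under stated assumptions: the matrix P, the old coordinates a of the new origin O', the helper names and the sample point are all chosen for illustration.

```python
from fractions import Fraction


def transpose(M):
    """Transpose: rows become columns."""
    return [list(col) for col in zip(*M)]


def apply(M, v):
    """Product of the matrix M by the column of numbers v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def inverse2(M):
    """Inverse of a nonsingular 2 x 2 integer matrix."""
    (a, b), (c, d) = M
    delta = a * d - b * c
    if delta == 0:
        raise ValueError("a singular matrix has no inverse")
    return [[Fraction(d, delta), Fraction(-b, delta)],
            [Fraction(-c, delta), Fraction(a, delta)]]


P = [[1, 1], [-1, 2]]   # row i: old coordinates of the new basis vector e'_i
a = [4, -1]             # old coordinates of the new origin O'
Q = inverse2(transpose(P))
a_prime = [-t for t in apply(Q, a)]   # a'_i = -sum_j Q_ij a_j


def old_from_new(x_new):
    """Formulas (I): x_j = sum_i P_ij x'_i + a_j."""
    return [s + t for s, t in zip(apply(transpose(P), x_new), a)]


def new_from_old(x_old):
    """Inverse formulas: x'_i = sum_j Q_ij x_j + a'_i."""
    return [s + t for s, t in zip(apply(Q, x_old), a_prime)]


M_new = [3, 1]
M_old = old_from_new(M_new)
print(M_old, new_from_old(M_old), a_prime)
```

The new origin itself has new coordinates (0, 0), which gives a quick consistency check: `new_from_old(a)` returns zeros.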

§ 3. Planes 

1. Suppose in n-dimensional affine space $\mathfrak{A}_n$ we have a fixed
arbitrary point A and, in the corresponding linear space $L_n$, a fixed
arbitrary r-dimensional subspace $L_r$.

Definition. The set of all points M of an affine space such that
$\overrightarrow{AM} \in L_r$ is called an r-dimensional plane passing through the
point A in the direction of the subspace $L_r$ (see Fig. 6, where
r = 2).

We also say that $L_r$ is the direction subspace of this plane. It is
obvious that every plane uniquely defines its direction subspace.

Point M is called the running point of the plane. Figure 6
depicts three positions, $M_1$, $M_2$, $M_3$, of the running point M.

2. Particular cases. (1) If r = 0, then the plane consists of the
single point A. Therefore, every point of an affine space may be
regarded as a zero-dimensional plane.

(2) A one-dimensional plane is called a straight line.

(3) A plane of dimension n − 1 is termed a hyperplane.

(4) When r = n the plane coincides with the entire space $\mathfrak{A}_n$.

3. In the definition of a plane we isolated a point A. We will
prove that in reality all points of the plane are of an equal status.

Denote the plane by $P_r$ and take a fixed arbitrary point B in $P_r$.
We have to prove that a point M belongs to the plane $P_r$ if and
only if $\overrightarrow{BM} \in L_r$ (that is to say, that any point B can play the
role of point A).

Let $\overrightarrow{BM} \in L_r$ (Fig. 7). By the definition of a plane,
$\overrightarrow{AB} \in L_r$, whence and by the definition of a subspace,
$\overrightarrow{AM} = \overrightarrow{AB} + \overrightarrow{BM} \in L_r$.
Therefore, $M \in P_r$. Conversely, if $M \in P_r$, then
$\overrightarrow{AM} \in L_r$, consequently
$\overrightarrow{BM} = \overrightarrow{AM} - \overrightarrow{AB} \in L_r$.

4. Theorem. Every r-dimensional plane in affine space is itself
an r-dimensional affine space.

Proof. Given an affine space $\mathfrak{A}$ to which corresponds a linear
space L. Let $P_r$ be a plane passing through a point A in the
direction of the subspace $L_r$. In the plane $P_r$ take two arbitrary
points M, N. By the definition of an affine space, they are
associated with the vector $\overrightarrow{MN} \in L$. By the definition of a
plane, the vectors $\overrightarrow{AM}$ and $\overrightarrow{AN}$ belong to the subspace
$L_r$. Hence,

$$\overrightarrow{MN} = \overrightarrow{AN} - \overrightarrow{AM} \in L_r$$

Thus, with every ordered pair of points M, N of plane $P_r$ is
associated a vector $\overrightarrow{MN}$ in the r-dimensional linear space $L_r$.
Here, the observance, for $P_r$, of the first of the axioms of
Subsection 2, Section 1, follows from the definition of an
r-dimensional plane; the second of the axioms of Subsection 2,
Section 1, holds true for $P_r$ because it holds for the entire affine
space $\mathfrak{A}$. This completes the proof of the theorem.

Remark. If the plane passes through the origin of an affine
system of coordinates in the direction of the subspace $L_r$, then the
aggregate of the radius vectors of its points forms a linear space,
which, by definition, coincides with the subspace $L_r$.

5. In the affine space $\mathfrak{A}$ let there be given r + 1 points
$A_0, A_1, \dots, A_r$. We say that these points are in the general
position if they do not all belong to one (r − 1)-dimensional plane.

The reader will have no difficulty in verifying that the points
$A_0, A_1, \dots, A_r$ are in the general position if and only if the
vectors $\overrightarrow{A_0A_1}, \dots, \overrightarrow{A_0A_r}$ are linearly independent
(Fig. 8). Note that it is immaterial which of the points is taken as
$A_0$ (that is, as the origin of the vectors issuing from it to other
points).

From the foregoing and from the definition of a plane it follows
that an r-dimensional plane, and only one such plane, passes
through the system of points $A_0, A_1, \dots, A_r$ lying in the general
position.

6. Suppose in the space $\mathfrak{A}_n$ we have a fixed affine system of
coordinates with origin O and basis $e_1, \dots, e_n$. Let us consider a
plane $P_r$ passing through point A in the direction of subspace $L_r$.

We assume that A has coordinates $p_1, \dots, p_n$ and that $L_r$ is
given as a linear hull of the independent system of vectors
$q_1, \dots, q_r$ (see Chapter I, Section 13, Subsection 5). Then the
radius vector $\overrightarrow{OM}$ of the running point of the plane may be
written as follows:

$$\overrightarrow{OM} = \overrightarrow{OA} + \overrightarrow{AM} = \overrightarrow{OA} + \tau_1q_1 + \dots + \tau_rq_r \tag{1}$$

where the parameters $\tau_1, \dots, \tau_r$ independently run through all
possible numerical values, and the vector
$\overrightarrow{OA} = p_1e_1 + \dots + p_ne_n$ (Fig. 9).

Resolve the vectors $q_1, \dots, q_r$ in terms of the basis
$e_1, \dots, e_n$:

$$q_i = q_{i1}e_1 + q_{i2}e_2 + \dots + q_{in}e_n$$

As usual, we denote the coordinates of the running point M by
$x_1, \dots, x_n$ and write down the vector equation (1) in coordinates
to obtain n numerical equations

$$\begin{aligned} x_1 &= q_{11}\tau_1 + q_{21}\tau_2 + \dots + q_{r1}\tau_r + p_1, \\ &\;\;\vdots \\ x_n &= q_{1n}\tau_1 + q_{2n}\tau_2 + \dots + q_{rn}\tau_r + p_n. \end{aligned} \tag{2}$$

These equations are called the parametric equations of the plane $P_r$
(in the given system of coordinates). Note that all the equations
of the system (2) are linear in the coordinates of the running point
and in the parameters $\tau_j$.

The converse is also true: the locus of points defined by the
equations (2) for all values of $\tau_j$ is a plane that passes through
the point A in the direction of the subspace $L_r = L(q_1, \dots, q_r)$.
Indeed, the equations (2) are equivalent to the vector equation (1),
which means that the vector $\overrightarrow{AM} \in L_r$.

If for the system (2) we write the corresponding homogeneous
system (that is, if we replace $p_1, \dots, p_n$ by zeros), we get the
parametric equations of the direction subspace of the plane $P_r$.
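The parametric equations lend themselves to a direct computation. The sketch below is an illustration only: the function names, the point A and the direction vectors are assumptions, and membership in the plane is tested by comparing ranks, which relies on the direction vectors being independent as the text requires.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def plane_point(p, qs, taus):
    """Equations (2): x_j = q_1j*tau_1 + ... + q_rj*tau_r + p_j."""
    return [p[j] + sum(q[j] * t for q, t in zip(qs, taus)) for j in range(len(p))]


def on_plane(point, p, qs):
    """True when the vector from A to the point lies in the hull of q_1..q_r."""
    am = [x - a for x, a in zip(point, p)]
    return rank(qs + [am]) == rank(qs)


p = [1, 0, 2]                   # coordinates of the point A
qs = [[1, 1, 0], [0, 1, 1]]     # coordinates of q_1, q_2 (independent)

M = plane_point(p, qs, [2, -1])   # parameters tau_1 = 2, tau_2 = -1
print(M, on_plane(M, p, qs), on_plane([0, 0, 0], p, qs))
```

Setting p to zeros in `plane_point` yields points of the direction subspace, in line with the remark on the homogeneous system.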

7. Example. The space studied in solid geometry is a three-dimensional
affine space, in which the one-dimensional and two-dimensional
planes coincide respectively with the straight lines and
the planes of elementary geometry. This can readily be proved in
a variety of ways, for instance, by taking advantage of the results
of the preceding subsection and the parametric equations of a
straight line and a plane as given in elementary analytic geometry.

8. Important remark. Metric concepts, such as distances between
points, lengths of lines, areas and volumes of figures, angles and
perpendicularity, are defined in the space studied in elementary
geometry, but are not defined in affine space. In affine space, one
investigates only those geometric properties of figures that do not
depend on metric notions. Nevertheless, such investigations are
substantive and permit solving many problems.

9. Before proceeding to the study of planes (and also other
figures) in affine space, we give in the next few sections the
necessary basic algebraic apparatus: systems of first-degree equations.

§ 4. Systems of first-degree equations 

1. Let us consider the following system of equations:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1, \\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2, \\
\dots\dots\dots\dots\dots\dots\dots\dots & \\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m
\end{aligned} \tag{1}$$

The letters $a_{11}, \dots, a_{mn}$, $b_1, \dots, b_m$ denote given numbers,
$x_1, \dots, x_n$ stand for unknowns.

The numbers $a_{ij}$ are called the coefficients of the system (1) and
form an $m \times n$ matrix

$$A = \begin{Vmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{Vmatrix}$$

which is called the basic matrix of system (1). Henceforth we will
assume that the matrix $A$ is nonzero, that is, that there are
coefficients in the system (1) that differ from zero.

The numbers $b_1, \dots, b_m$ are called the constant terms of the
equations.

The matrix

$$B = \begin{Vmatrix} a_{11} & \dots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \dots & a_{mn} & b_m \end{Vmatrix}$$

is called the augmented matrix of the system (1).

Any ordered $n$-tuple of numbers $x_1, \dots, x_n$ whose substitution
in place of the unknowns makes all the equations of the system
arithmetic identities is called a solution of the system. A system
is said to be consistent if it has at least one solution.

2. Kronecker-Capelli theorem. For system (1) to be consistent,
it is necessary and sufficient that the rank of its augmented
matrix be equal to the rank of the basic matrix:

$$\operatorname{rank} B = \operatorname{rank} A \tag{2}$$

Proof. (1) Necessity. It is clear that

$$\operatorname{rank} B \geq \operatorname{rank} A \tag{3}$$

Denote by $a_1, \dots, a_n$ the columns of matrix $A$ and by $b$ the
column of constant terms, and then regard all these columns as
vectors in the coordinate space $K_m$. Let system (1) have the
solution $x_1, \dots, x_n$. This solution converts the equations to a
system of numerical identities, which may be written down as a single
vector equation:

$$x_1 a_1 + x_2 a_2 + \dots + x_n a_n = b \tag{4}$$

From (4) it follows that the vectors $a_1, \dots, a_n, b$ are linearly
expressible in terms of the vectors $a_1, \dots, a_n$. By Lemma 3 of
Section 6, Chapter I, and by virtue of the definition of the rank of a
matrix we have

$$\operatorname{rank} B \leq \operatorname{rank} A \tag{5}$$

From (3) and (5) follows (2).

(2) Sufficiency. Let (2) hold true. Matrix $A$ is by hypothesis
nonzero; it therefore has a basis minor of order $r = \operatorname{rank} A > 0$.

For the sake of definiteness, let us assume that the first $r$
columns $a_1, \dots, a_r$ are basis columns. Consider the system of vectors
$a_1, \dots, a_r, b$. This system is linearly dependent, for otherwise
$\operatorname{rank} B = r + 1 > r$. Therefore, the vector $b$ can be expressed in
terms of the linearly independent vectors $a_1, \dots, a_r$ (see Chapter I,
Section 6, Lemma 2):

$$b = \lambda_1 a_1 + \dots + \lambda_r a_r \tag{6}$$

Set

$$x_1 = \lambda_1, \ \dots, \ x_r = \lambda_r, \quad x_{r+1} = \dots = x_n = 0 \tag{7}$$

Writing system (1) in the vector form (4) and substituting into
it the quantities (7), we get the identity (6). Thus, system (1) is
consistent and therefore has at least one solution (7). The proof
is complete.
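
The criterion (2) is easy to test on small systems. The sketch below is an illustration only: it computes ranks by Gaussian elimination over exact fractions (a technique chosen here for convenience, not taken from the text), and the names `rank` and `is_consistent` are assumptions.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, found by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a nonzero entry in this column among the unused rows.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def is_consistent(a, b):
    """Kronecker-Capelli test: rank of augmented matrix B equals rank of A."""
    augmented = [list(row) + [bi] for row, bi in zip(a, b)]
    return rank(augmented) == rank(a)
```

For instance, with the hypothetical matrix `[[1, 1], [2, 2]]` the right members `[1, 2]` give a consistent system, while `[1, 3]` raise the rank of the augmented matrix and make the system inconsistent.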


3. We note the particular case where the number of equations
is equal to the number of unknowns and matrix $A$ (square in this
case) is nonsingular, that is,

$$D = \det A = \begin{vmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{vmatrix} \neq 0$$

Then system (1) has a unique solution which may be found by
Cramer's rule:

$$x_1 = \frac{D_1(b)}{D}, \quad x_2 = \frac{D_2(b)}{D}, \quad \dots, \quad x_n = \frac{D_n(b)}{D} \tag{8}$$

Here $D_j(b)$ denotes the determinant obtained from $D$ by replacing
the $j$th column by the column of constant terms, that is,

$$D_j(b) = \begin{vmatrix}
a_{11} & \dots & a_{1\,j-1} & b_1 & a_{1\,j+1} & \dots & a_{1n} \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_{n1} & \dots & a_{n\,j-1} & b_n & a_{n\,j+1} & \dots & a_{nn}
\end{vmatrix} \tag{9}$$

Remark 1. If $x$ denotes the vector $(x_1, \dots, x_n)$ written as a
column, then system (1) can be represented as the matrix equation

$$Ax = b \tag{1a}$$

and Cramer's formulas (8) as the matrix equation

$$x = A^{-1}b \tag{8a}$$

The transition from (1a) to (8a) is attained by multiplying both
members on the left by the matrix $A^{-1}$.

Remark 2. Formulas (8) are not convenient for practical solving
of systems with large numbers of equations and unknowns because
of the difficulties in computing the determinants $D$ and $D_j(b)$. For
this reason, a variety of other methods have been devised for
solving such systems. They are given in books on computational
mathematics.
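
For small systems, formulas (8) and (9) can be applied directly. The sketch below is illustrative only: the names `det` and `cramer` are assumptions, integer coefficients are assumed, and the determinant is computed by cofactor expansion, which is exactly the costly approach that Remark 2 warns against for large systems.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def cramer(a, b):
    """Solve a square nonsingular system with integer entries by formulas (8)."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    solution = []
    for j in range(len(a)):
        # D_j(b): replace the j-th column of A by the column of constant terms.
        dj = det([row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)])
        solution.append(Fraction(dj, d))
    return solution


# Hypothetical system: 2*x1 + x2 = 3, x1 + 3*x2 = 5.
x = cramer([[2, 1], [1, 3]], [3, 5])
```

Here $D = 5$, $D_1(b) = 4$ and $D_2(b) = 7$, so the sketch returns $x_1 = 4/5$, $x_2 = 7/5$.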


4. Let us return to system (1) for arbitrary $m$ and $n$. We assume
that the conditions of consistency (2) hold true. Our aim will be
to find all the solutions of the system. The number $r$, which is
equal to the rank of matrices $A$ and $B$, will be called the rank of
the system (1). For the sake of definiteness, we will assume that
the basis minor of matrix $A$ occupies the upper left corner (this
can always be achieved by renumbering the unknowns and the
positions of the equations). We denote this minor by $D$:

$$D = \begin{vmatrix} a_{11} & \dots & a_{1r} \\ \vdots & & \vdots \\ a_{r1} & \dots & a_{rr} \end{vmatrix} \neq 0$$

$D$ is a basis minor for the matrix $B$ as well, and so the rows of
$B$ having numbers $r + 1, \dots, m$ are linear combinations of the
first $r$ rows of the matrix (see Chapter I, Section 7). This means
that equations having numbers $r + 1, \dots, m$ are linear combinations
of the first $r$ equations so that the system (1) is equivalent
to the system

$$\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= b_1, \\
\dots\dots\dots\dots\dots\dots & \\
a_{r1}x_1 + \dots + a_{rn}x_n &= b_r
\end{aligned} \tag{10}$$

On the left we leave only those terms whose coefficients form
the basis minor $D$, and transpose all other terms to the right:

$$\begin{aligned}
a_{11}x_1 + \dots + a_{1r}x_r &= b_1 - a_{1\,r+1}x_{r+1} - \dots - a_{1n}x_n, \\
\dots\dots\dots\dots\dots\dots & \\
a_{r1}x_1 + \dots + a_{rr}x_r &= b_r - a_{r\,r+1}x_{r+1} - \dots - a_{rn}x_n
\end{aligned} \tag{11}$$


We will say that the unknowns $x_{r+1}, \dots, x_n$ are free, since any
numerical values can be assigned to them. Then the unknowns
$x_1, \dots, x_r$ are unambiguously determined from system (11) by
Cramer's formulas:

$$x_j = \frac{D_j(b - x_{r+1}a_{r+1} - \dots - x_n a_n)}{D}, \qquad j = 1, \dots, r \tag{12}$$

As before, the $a_j$ here denote the columns of the basic matrix $A$ of
system (10) and $b$ denotes the column of constant terms of the
system (10). The symbol $D_j$ is determined by formula (9) in which
$n$ is replaced by $r$ and the vector $b$ is replaced by the vector

$$b - x_{r+1}a_{r+1} - \dots - x_n a_n$$

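Using the properties of determinants, we expand the numerator
of (12) to obtain

$$x_j = \frac{D_j(b)}{D} + x_{r+1}\frac{D_j(-a_{r+1})}{D} + \dots + x_n\frac{D_j(-a_n)}{D} \qquad (j = 1, \dots, r) \tag{13}$$

Let us introduce the following notation:

$$p_j = \frac{D_j(b)}{D}, \qquad q_{kj} = \frac{D_j(-a_k)}{D}, \qquad j = 1, \dots, r, \quad k = r + 1, \dots, n \tag{14}$$

Then from (13) we have

$$\begin{aligned}
x_1 &= p_1 + q_{r+1\,1}x_{r+1} + \dots + q_{n1}x_n, \\
&\;\;\vdots \\
x_r &= p_r + q_{r+1\,r}x_{r+1} + \dots + q_{nr}x_n
\end{aligned} \tag{15}$$

We adjoin another $n - r$ obvious equations:

$$x_{r+1} = x_{r+1}, \quad \dots, \quad x_n = x_n \tag{16}$$

Now substitute a single vector equation for all the equations of
(15) and (16):

$$\begin{Vmatrix} x_1 \\ \vdots \\ x_r \\ x_{r+1} \\ \vdots \\ x_n \end{Vmatrix}
= \begin{Vmatrix} p_1 \\ \vdots \\ p_r \\ 0 \\ \vdots \\ 0 \end{Vmatrix}
+ \begin{Vmatrix} q_{r+1\,1} \\ \vdots \\ q_{r+1\,r} \\ 1 \\ \vdots \\ 0 \end{Vmatrix} x_{r+1}
+ \dots
+ \begin{Vmatrix} q_{n1} \\ \vdots \\ q_{nr} \\ 0 \\ \vdots \\ 1 \end{Vmatrix} x_n \tag{17}$$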


Formula (17) yields a general solution to the system (1) since it
expresses all the unknowns $x_1, \dots, x_r, x_{r+1}, \dots, x_n$ in terms of
the free unknowns $x_{r+1}, \dots, x_n$, to which we can assign arbitrary
numerical values. We will show that all solutions of system (1)
are thereby exhausted. Indeed, if $x_1, \dots, x_r, x_{r+1}, \dots, x_n$ is any
given solution of (1), then $x_{r+1}, \dots, x_n$ have definite numerical
values. Substituting them into (1) and repeating the previous
manipulations, we get equation (17).


5. Denote by $x$ the column in the left member of (17), and by $p$,
$q_{r+1}, \dots, q_n$ the columns in the right member of that equation in
the order of their arrangement. Then (17) takes the form

$$x = p + x_{r+1}q_{r+1} + \dots + x_n q_n \tag{18}$$

Equations (17) and (18) are to be understood as equations
between the vectors of the coordinate space $K_n$.
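
The vectors $p, q_{r+1}, \dots, q_n$ of formula (18) can be computed mechanically. The sketch below is a stand-in rather than the method of the text: instead of Cramer's formulas (12) to (14) it uses Gauss-Jordan elimination over exact fractions, and instead of assuming the basis minor in the upper left corner it picks pivot columns as they occur. The name `general_solution` is an assumption.

```python
from fractions import Fraction


def general_solution(a, b):
    """Return (p, basis) such that every solution is p + sum(t_k * basis[k]).

    Returns None when the system is inconsistent (rank B > rank A).
    """
    n = len(a[0])
    m = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(a, b)]
    pivots = []  # columns of the unknowns that are NOT free
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    if any(row[n] != 0 for row in m[r:]):
        return None
    free = [c for c in range(n) if c not in pivots]
    p = [Fraction(0)] * n
    for i, c in enumerate(pivots):
        p[c] = m[i][n]
    basis = []
    for k in free:
        q = [Fraction(0)] * n
        q[k] = Fraction(1)  # the "obvious" equation x_k = x_k of (16)
        for i, c in enumerate(pivots):
            q[c] = -m[i][k]
        basis.append(q)
    return p, basis


# Hypothetical system: x1 + x2 + x3 = 6, x2 - x3 = 0.
p, basis = general_solution([[1, 1, 1], [0, 1, -1]], [6, 0])
```

For this sample system $x_3$ is the free unknown, and the sketch returns $p = (6, 0, 0)$ and the single vector $q_3 = (-2, 1, 1)$, so that $x = p + x_3 q_3$.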


6. Corollary. If system (1) is consistent and its rank $r$ is less
than the number of unknowns $n$, then the system has an infinitude
of solutions.

Remark. Generally speaking, the choice of free unknowns may
be accomplished in different ways. However, not just any collection
of $n - r$ unknowns can be taken as free unknowns. It is required
that the coefficients of the remaining $r$ unknowns in system (1)
form a basis minor of the matrix $A$.


§ 5. Homogeneous systems 


1. The system of equations

$$\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= 0, \\
\dots\dots\dots\dots\dots\dots & \\
a_{m1}x_1 + \dots + a_{mn}x_n &= 0
\end{aligned} \tag{1}$$

is said to be homogeneous; here, the right members of all equations
are equal to zero:

$$b_1 = \dots = b_m = 0 \tag{2}$$

2. A homogeneous system is always consistent. On the one
hand, this follows from the Kronecker-Capelli theorem:
$\operatorname{rank} B = \operatorname{rank} A$, since the matrix $B$ is obtained from $A$ by
adjoining a zero column.

On the other hand, it is immediately apparent that system (1)
has a zero solution:

$$x_1 = \dots = x_n = 0$$

The zero solution of a homogeneous system is said to be a
trivial solution. All other solutions are termed nontrivial.

3. As before, we will consider the solutions of system (1) as
vectors in the coordinate space $K_n$.

Theorem 1. The solution set of a homogeneous system forms in
the space $K_n$ a subspace of dimension $n - r$, where $r$ is the rank
of the system.

Proof. Due to condition (2), formulas (14) of Section 4 give

$$p_j = \frac{D_j(b)}{D} = 0, \qquad j = 1, \dots, r$$

Therefore, in the case at hand, $p = 0$ and formula (18) of Section 4
expresses any solution $x$ as a linear combination of the
vectors $q_{r+1}, \dots, q_n$. Conversely, any linear combination of vectors
$q_{r+1}, \dots, q_n$ yields a solution of the homogeneous system (1). In
other words, the set $X$ of all solutions of such a system is a linear
hull of vectors $q_{r+1}, \dots, q_n$ in $K_n$. Hence, $X$ is a linear subspace
of $K_n$.

We now make sure that the vectors $q_{r+1}, \dots, q_n$ are linearly
independent. With this purpose in mind, let us consider the matrix
$\Gamma$ made up of the coordinates of the vectors $q_{r+1}, \dots, q_n$:

$$\Gamma = \begin{Vmatrix}
q_{r+1\,1} & \dots & q_{n1} \\
\vdots & & \vdots \\
q_{r+1\,r} & \dots & q_{nr} \\
1 & \dots & 0 \\
\vdots & \ddots & \vdots \\
0 & \dots & 1
\end{Vmatrix}$$

The lower minor of maximum order of the matrix $\Gamma$ is its basis
minor (it is equal to unity, that is to say, it is different from zero
and does not have any bordering minors). Hence, the columns of $\Gamma$
are linearly independent, which means the vectors $q_{r+1}, \dots, q_n$
are linearly independent as well.

From the foregoing it follows that the vectors $q_{r+1}, \dots, q_n$
constitute a basis in $X$. But the number of vectors is equal to $n - r$,
hence $X$ has dimension $n - r$, and the proof of the theorem is
complete.

4. Let there be given a linearly independent system of solutions,
$n - r$ in all:

$$c_1 = \begin{Vmatrix} c_{11} \\ \vdots \\ c_{1n} \end{Vmatrix}, \quad \dots, \quad
c_{n-r} = \begin{Vmatrix} c_{n-r\,1} \\ \vdots \\ c_{n-r\,n} \end{Vmatrix} \tag{3}$$

Then any solution $x$ of system (1) can be represented in the form
of a linear combination of the given solutions (3):

$$x = \tau_1 c_1 + \dots + \tau_{n-r} c_{n-r} \tag{4}$$

Conversely, any linear combination of the form (4) yields a solution.

Both assertions follow immediately from the preceding theorem.
Namely, by this theorem the subspace $X$ of solutions of the
system (1) has dimension $n - r$. Hence, the solutions $c_1, \dots, c_{n-r}$
constitute a basis in that subspace.

Definition. Any linearly independent set of $n - r$ solutions is
said to be fundamental with respect to the system of equations (1).

Conclusion. To solve a homogeneous system of equations (1),
it suffices to find some fundamental system of its solutions
$c_1, \dots, c_{n-r}$. Then all solutions of (1) are given by (4), in which
each of the parameters $\tau_1, \dots, \tau_{n-r}$ independently runs through
all possible numerical values.

Remark. One of the fundamental systems of solutions is made
up of the columns of matrix $\Gamma$. This solution set is given by
formulas (14) of Section 4.

5. Example. Let us consider the system of equations

$$\begin{aligned}
x_1 + x_2 + x_3 - x_4 &= 0, \\
x_2 - x_3 + x_4 &= 0
\end{aligned} \tag{5}$$

Here $n = 4$, $r = 2$, and so the solution space has dimension
$n - r = 2$. Hence all we need to do is find some two independent
solutions, say

$$x_1 = 0, \quad x_2 = 0, \quad x_3 = 1, \quad x_4 = 1;$$
$$x_1 = 2, \quad x_2 = -1, \quad x_3 = -1, \quad x_4 = 0$$

whence we get the general solution to system (5):

$$\begin{Vmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{Vmatrix}
= \tau_1 \begin{Vmatrix} 0 \\ 0 \\ 1 \\ 1 \end{Vmatrix}
+ \tau_2 \begin{Vmatrix} 2 \\ -1 \\ -1 \\ 0 \end{Vmatrix}$$
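
The example can be checked by direct substitution. In the short sketch below the helper names `apply` and `combo` are illustrative assumptions; the data are those of system (5).

```python
# Coefficient matrix of system (5) and the two solutions chosen in the text.
A = [[1, 1, 1, -1], [0, 1, -1, 1]]
c1 = [0, 0, 1, 1]
c2 = [2, -1, -1, 0]


def apply(a, x):
    """Left members of the system evaluated at the vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def combo(t1, t2):
    """The linear combination t1*c1 + t2*c2 of the two solutions."""
    return [t1 * u + t2 * v for u, v in zip(c1, c2)]


# A nonzero 2x2 minor built from coordinates 1 and 3 of c1 and c2
# shows that the two solutions are linearly independent.
minor = c1[0] * c2[2] - c1[2] * c2[0]
```

Both vectors give zero left members, the minor equals $-2 \neq 0$, and any combination such as `combo(3, -5)` is again a solution.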


6. Important particular cases. (1) A homogeneous system of $n$
equations in $n$ unknowns:

$$\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= 0, \\
\dots\dots\dots\dots\dots\dots & \\
a_{n1}x_1 + \dots + a_{nn}x_n &= 0
\end{aligned} \tag{6}$$

Relative to system (6) we note only one theorem, which is
frequently used in the applications of linear algebra.

Theorem 2. A system of type (6) has a nontrivial solution if
and only if its determinant is zero:

$$D = \det \|a_{ij}\| = 0$$

Indeed, in this case and only in this case is $r = \operatorname{rank} A < n$
and the dimension of the solution space positive: $n - r > 0$.

(2) A homogeneous system of $n - 1$ independent equations in
$n$ unknowns:

$$\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= 0, \\
\dots\dots\dots\dots\dots\dots & \\
a_{n-1\,1}x_1 + \dots + a_{n-1\,n}x_n &= 0
\end{aligned} \tag{7}$$

The independence condition of the equations means that the
(rectangular) matrix $A = \|a_{ij}\|$ of system (7) has rank $r = n - 1$. In
this case the solution space is one-dimensional ($n - r = 1$), and
to obtain the general solution of system (7) it suffices to find one
nontrivial solution. This is done as follows.

Form an auxiliary square matrix $\tilde{A}$ of order $n$, which is obtained
from matrix $A$ by adjoining a new row at the top:

$$\tilde{A} = \begin{Vmatrix}
a_{01} & \dots & a_{0n} \\
a_{11} & \dots & a_{1n} \\
\vdots & & \vdots \\
a_{n-1\,1} & \dots & a_{n-1\,n}
\end{Vmatrix}$$

where $a_{01}, \dots, a_{0n}$ are arbitrary numbers. Denote by $A_{0j}$ the
cofactors of the elements $a_{0j}$ in matrix $\tilde{A}$. Then the quantities

$$x_1 = A_{01}, \quad x_2 = A_{02}, \quad \dots, \quad x_n = A_{0n} \tag{8}$$

form a solution of system (7). Indeed, substitute the quantities (8) into
the $i$th equation to obtain the sum of the products of the elements
of one row of matrix $\tilde{A}$ by the cofactors of the elements of another
row, which sum is known to be equal to zero:

$$a_{i1}A_{01} + \dots + a_{in}A_{0n} = 0 \qquad (i = 1, \dots, n - 1)$$

Thus, the numbers (8) satisfy the system (7).

Denote by $M_j$ the minor of matrix $A$ of order $n - 1$ obtained by
striking out the $j$th column:

$$M_j = \begin{vmatrix}
a_{11} & \dots & a_{1\,j-1} & a_{1\,j+1} & \dots & a_{1n} \\
a_{21} & \dots & a_{2\,j-1} & a_{2\,j+1} & \dots & a_{2n} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{n-1\,1} & \dots & a_{n-1\,j-1} & a_{n-1\,j+1} & \dots & a_{n-1\,n}
\end{vmatrix}$$

Then $A_{0j} = (-1)^{j+1}M_j$. Among the minors $M_j$ there is at least one
that is nonzero (the basis minor of matrix $A$). For this reason,
solution (8) is nontrivial. The general solution of system (7) may
be written thus:

$$x_j = (-1)^{j+1}M_j\,\tau$$

where $\tau$ is an arbitrary number. In other words, the solution of
system (7) is proportional to the minors of maximum order of the
matrix $A$ taken with alternating signs. This is sometimes written
as follows:

$$x_1 : x_2 : x_3 : \dots = M_1 : (-M_2) : M_3 : \dots$$
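
The rule of minors with alternating signs can be illustrated as follows. This is a sketch under stated assumptions: the names `det`, `solution_by_minors` and `apply` are illustrative, and the sample coefficients are hypothetical. For $n = 3$ the result is the familiar vector product of the two coefficient rows.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def solution_by_minors(a):
    """For an (n-1) x n matrix a, return the solution x_j = (-1)^(j+1) * M_j.

    The loop index j is 0-based, so the sign (-1)**j used below coincides
    with (-1)^(j+1) for the 1-based column number of the text.
    """
    n = len(a[0])
    x = []
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in a]  # strike out column j
        x.append((-1) ** j * det(minor))
    return x


def apply(a, x):
    """Left members of the system evaluated at the vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


# Hypothetical system: x1 + 2*x2 + 3*x3 = 0, 4*x1 + 5*x2 + 6*x3 = 0.
rows = [[1, 2, 3], [4, 5, 6]]
x = solution_by_minors(rows)
```

For these rows $M_1 = -3$, $M_2 = -6$, $M_3 = -3$, so the sketch returns $(-3, 6, -3)$, and every solution is a multiple of this vector.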

7. Earlier, in Subsection 3, we demonstrated that a homogeneous
system of first-degree equations of rank $r$ determines in an
$n$-dimensional linear space $L_n$ a subspace of dimension $n - r$. We will
now show that the converse is also true, namely that the following
theorem is valid.

Theorem 3. Any subspace of dimension $k$ in a space with a
given basis is a solution subspace of some homogeneous system
of linear equations of rank $n - k$.

Proof. Let there be given in $L_n$ a basis $e_1, \dots, e_n$ and a
subspace $L_k$. In this subspace take $k$ independent vectors denoted by
$e'_{n-k+1}, \dots, e'_n$. Using the lemma of Subsection 7, Section 14,
Chapter I, complete them to form a basis in $L_n$:

$$e'_1, \ \dots, \ e'_{n-k}, \ e'_{n-k+1}, \ \dots, \ e'_n \tag{9}$$

The subspace $L_k$ is a linear hull of the vectors $e'_{n-k+1}, \dots, e'_n$.
Therefore the vector $x$ in $L_n$ lies in $L_k$ if and only if in the basis (9)
its coordinates with numbers $1, \dots, n - k$ are zero:

$$x'_i = 0 \qquad (i = 1, \dots, n - k) \tag{10}$$

Formulas (10) constitute a system of equations that determine $L_k$
in the basis (9). Now let us pass to the original basis $e_1, \dots, e_n$.
To do this we take advantage of the formulas for transforming
the coordinates (see Chapter II, Section 5):

$$x'_i = \sum_{j=1}^{n} Q_{ij}x_j$$

Whence we obtain the desired system of equations (11), which is
equivalent to the system (10):

$$\sum_{j=1}^{n} Q_{ij}x_j = 0, \qquad i = 1, \dots, n - k \tag{11}$$

Since the $n \times n$ matrix $Q = \|Q_{ij}\|$ is nonsingular, all its rows
are linearly independent. Hence, the rank of system (11) is equal
to the number of its equations: $r = n - k$. The proof of Theorem 3
is complete.
8. As will be seen from the foregoing proof, the specific form
of a system of equations defining $L_k$ depends on the choice of
basis.

Also, a given subspace may be specified in a given basis by
distinct homogeneous systems of equations. This is clear since for
system (11) there are an infinity of other equivalent systems. We
now show how these systems can be constructed. Let

$$H = \|h_{li}\| = \begin{Vmatrix}
h_{11} & \dots & h_{1\,n-k} \\
\vdots & & \vdots \\
h_{n-k\,1} & \dots & h_{n-k\,n-k}
\end{Vmatrix}$$

be any nonsingular square matrix of order $n - k$. Fix $l$, multiply
equations (11) respectively by the numbers $h_{li}$ ($i = 1, \dots, n - k$),
and add them. Then write down the resulting relations taking
$l = 1, \dots, n - k$ to get the homogeneous system

$$\sum_{i=1}^{n-k} h_{li} \sum_{j=1}^{n} Q_{ij}x_j = 0 \qquad (l = 1, \dots, n - k) \tag{12}$$

By introducing the numbers

$$R_{lj} = \sum_{i=1}^{n-k} h_{li}Q_{ij} \qquad (l = 1, \dots, n - k; \ j = 1, \dots, n) \tag{13}$$

we can write system (12) more simply:

$$\sum_{j=1}^{n} R_{lj}x_j = 0 \qquad (l = 1, \dots, n - k) \tag{14}$$

System (14) is clearly a corollary to system (11). We will now
show that in turn system (11) is a consequence of system (14).
For each fixed $j$ ($1 \leq j \leq n$), formula (13) may be regarded as a
system of equations with unknowns $Q_{ij}$ and right members $R_{lj}$.
Solving this system by Cramer's rule for each $j$, we find that

$$Q_{ij} = \sum_{\alpha=1}^{n-k} \tilde{h}_{i\alpha}R_{\alpha j} \tag{15}$$

where the matrix $\|\tilde{h}_{i\alpha}\| = H^{-1}$ is the inverse of $H = \|h_{li}\|$.

Formula (15) shows that the coefficients of system (11) are
expressed in terms of the coefficients of system (14) with the aid of
the matrix $\|\tilde{h}_{i\alpha}\|$ just as the coefficients of (14) were expressed in
terms of (11) with the aid of the matrix $\|h_{li}\|$. Thus, the systems
(11) and (14) are equivalent, and from the given system (11) we
can obtain an infinity of equivalent systems of the form (12) for
the reason that there are an infinity of ways of choosing the
nonsingular matrix $H$.

Remark. If the rectangular matrices of systems of equations
(11) and (14) are denoted by $Q$ and $R$, respectively, the two systems
of equations (13) and (15) may be replaced by the two matrix
equations

$$R = HQ, \qquad Q = H^{-1}R$$

whence it follows that $\operatorname{rank} R = \operatorname{rank} Q$. In other words, all
systems of the form (14) that we construct via the given system
(11) have the same rank, $n - k$.

Example. The system

$$\begin{aligned}
x_1 + x_2 - x_3 - x_4 &= 0, \\
x_1 - x_2 + x_3 + x_4 &= 0
\end{aligned}$$

defines in four-dimensional linear space a certain two-dimensional
subspace $L_2$. Taking

$$H = \frac{1}{2}\begin{Vmatrix} 1 & 1 \\ 1 & -1 \end{Vmatrix}$$

we obtain for (14) the system

$$x_1 = 0, \qquad x_2 - x_3 - x_4 = 0$$

which defines the same subspace $L_2$ as the given system (in other
words, we replace the given equations by their half-sum and
half-difference).
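
The passage from one system to an equivalent one via $R = HQ$ and back via $Q = H^{-1}R$ can be verified on this example. The helper `matmul` below is an illustrative assumption; the matrices are those of the example, with the inverse of $H$ written out by hand.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


half = Fraction(1, 2)
Q = [[1, 1, -1, -1], [1, -1, 1, 1]]   # coefficients of the given system
H = [[half, half], [half, -half]]     # half-sum and half-difference
H_inv = [[1, 1], [1, -1]]             # inverse of H, since H * H_inv = I

R = matmul(H, Q)                      # coefficients of the new system
```

The product `R` has the rows $(1, 0, 0, 0)$ and $(0, 1, -1, -1)$, which are the coefficients of $x_1 = 0$ and $x_2 - x_3 - x_4 = 0$, and multiplying `R` by `H_inv` restores `Q`.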

9. We conclude this section with a proof that a nonsingular linear
transformation of variables preserves the rank of the system of
equations (1). Write (1) in the form of a matrix equation:

$$AX = 0 \tag{1a}$$

where $A = \|a_{ij}\|$ is an $m \times n$ matrix of coefficients, $X$ is the
column matrix of the unknowns $x_1, \dots, x_n$, and the zero in the right
member denotes the zero $m \times 1$ matrix. Let there be given the
change of variables $X' = QX$ (see formulas (IIIa) and (III), Section 5,
Chapter II; $Q$ is a nonsingular $n \times n$ matrix). Then

$$X = P^{*}X' \tag{16}$$

where $P = \|P_{\alpha j}\| = (Q^{-1})^{*}$. Substituting (16) into (1a), we get the
following matrix notation for the system of equations under
consideration in the new variables $x'_1, \dots, x'_n$:

$$(AP^{*})X' = 0$$

or, expanded,

$$\sum_{j=1}^{n}\sum_{\alpha=1}^{n} a_{i\alpha}P^{*}_{\alpha j}x'_j = 0, \qquad i = 1, \dots, m$$

where $P^{*}_{\alpha j}$ denote the elements of the matrix $P^{*}$.

The matrix $P$ is nonsingular so that

$$\operatorname{rank} AP^{*} = \operatorname{rank} A$$

according to Subsection 3, Section 4, Chapter II. This completes
the proof of the assertion stated at the beginning of this subsection.


§ 6. Nonhomogeneous systems 

1. Given a nonhomogeneous system

$$\sum_{j=1}^{n} a_{ij}x_j = b_i \tag{1}$$

where $i = 1, \dots, m$ and among the $b_i$ there are nonzero numbers.
Assume that the system is consistent, that is, that
$\operatorname{rank} A = \operatorname{rank} B = r$. Let $\{x_1^0, \dots, x_n^0\}$ be a solution of system (1).
Substituting this solution into system (1), we get the identities

$$\sum_{j=1}^{n} a_{ij}x_j^0 = b_i \tag{2}$$

Subtract identities (2) from equations (1) to get system (3),
which is equivalent to system (1):

$$\sum_{j=1}^{n} a_{ij}(x_j - x_j^0) = 0 \tag{3}$$

Put $x_j - x_j^0 = u_j$ and we get the homogeneous system

$$\sum_{j=1}^{n} a_{ij}u_j = 0 \tag{4}$$

Suppose that for the system of equations (4) we know the
fundamental set of solutions

$$c_1 = \begin{Vmatrix} c_{11} \\ \vdots \\ c_{1n} \end{Vmatrix}, \quad \dots, \quad
c_{n-r} = \begin{Vmatrix} c_{n-r\,1} \\ \vdots \\ c_{n-r\,n} \end{Vmatrix} \tag{5}$$

Then, by the results of Subsection 4, Section 5, any solution of
(4) can be expressed in the form of a linear combination of
vectors (5), that is

$$u_j = \tau_1 c_{1j} + \dots + \tau_{n-r}c_{n-r\,j} \tag{6}$$

where $\tau_1, \dots, \tau_{n-r}$ are arbitrary scalars. Since $u_j = x_j - x_j^0$,
from (6) we get

$$x_j = x_j^0 + [\tau_1 c_{1j} + \dots + \tau_{n-r}c_{n-r\,j}], \qquad j = 1, \dots, n \tag{7}$$

Let us call $x_1^0, \dots, x_n^0$ a particular solution of system (1). The
sum in brackets in (7) is the general solution of system (4).

The system (4) obtained from (1) by replacing the right
members by zeros is called the homogeneous system corresponding
to system (1).

Formula (7) shows that the following theorem holds true.

Theorem 1. The general solution of a nonhomogeneous system
(1) is represented in the form of a sum of an arbitrary particular
solution of that system and the general solution of the
corresponding homogeneous system.
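
Theorem 1 can be illustrated on a small system. In the sketch below the right members $b = (1, 2)$ and the particular solution `x0` are hypothetical choices made for the illustration; the coefficient matrix and the fundamental solutions `c1`, `c2` are those of the example in Subsection 5 of Section 5. The helper names are assumptions.

```python
# Nonhomogeneous system: x1 + x2 + x3 - x4 = 1, x2 - x3 + x4 = 2.
A = [[1, 1, 1, -1], [0, 1, -1, 1]]
b = [1, 2]

x0 = [-1, 2, 0, 0]      # a particular solution (found by setting x3 = x4 = 0)
c1 = [0, 0, 1, 1]       # fundamental solutions of the homogeneous system
c2 = [2, -1, -1, 0]


def apply(a, x):
    """Left members of the system evaluated at the vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def general(t1, t2):
    """Formula (7): particular solution plus a combination of c1 and c2."""
    return [p + t1 * u + t2 * v for p, u, v in zip(x0, c1, c2)]
```

Whatever values are given to the two parameters, `apply(A, general(t1, t2))` reproduces the right members `b`.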

2. A geometrical interpretation of the set of solutions of a
nonhomogeneous system of linear equations. We consider an
$n$-dimensional affine space $\mathfrak{A}_n$. In it specify an affine system of
coordinates. Then to each solution $x_1, \dots, x_n$ of (1) we can associate a
point of the space $\mathfrak{A}_n$ with the coordinates $x_1, \dots, x_n$. The
following theorem is valid.

Theorem 2. All solutions of system (1) form in $\mathfrak{A}_n$ a plane of
dimension $n - r$.

Proof. All solutions of system (1) are given by formula (7).
Because of the independence of the vectors (5), this formula is
nothing but the parametric equations of a certain plane of
dimension $n - r$ (see Section 3, Subsection 6). The proof of Theorem 2
is complete.

Theorem 3. In affine space $\mathfrak{A}_n$ and in any affine coordinates,
any plane of dimension $m$ may be specified by a system of linear
equations of the form (1) and of rank $r = n - m$.

Proof. Let the plane $P_m$ pass through a point $A$ having
coordinates $x_1^0, \dots, x_n^0$ in the direction of the subspace $L_m$. Transfer the
origin of the affine system of coordinates to point $A$, preserving
the original basis. We denote the coordinates of the running point
$M$ in the original system by $x_1, \dots, x_n$ and in the new system by
$\tilde{x}_1, \dots, \tilde{x}_n$. The latter coincide with the coordinates of the vector
$\overrightarrow{AM} \in L_m$. By Theorem 3, Section 5, the subspace $L_m$ is given by
a certain homogeneous system of linear equations of rank
$r = n - m$:

$$\sum_{j=1}^{n} a_{ij}\tilde{x}_j = 0, \qquad i = 1, \dots, r$$

Taking into account that $\tilde{x}_j = x_j - x_j^0$, we get

$$\sum_{j=1}^{n} a_{ij}(x_j - x_j^0) = 0$$

Putting $b_i = \sum_{j=1}^{n} a_{ij}x_j^0$, we find the system of equations

$$\sum_{j=1}^{n} a_{ij}x_j = b_i, \qquad i = 1, \dots, r \tag{8}$$

which is of the same rank $r = n - m$ and defines $P_m$ in the
original coordinates. This completes the proof of Theorem 3.

3. Corollary. A plane is given by a homogeneous system of
linear equations if and only if it passes through the coordinate
origin.

4. Important special case. A hyperplane is specified by a single
linear equation:

$$a_1x_1 + a_2x_2 + \dots + a_nx_n = b$$

5. Each one of the equations in (8) can be regarded as the
equation of some hyperplane. For this reason, every plane of dimension
$m$ may be regarded as the intersection of a certain number $n - m$
of hyperplanes.

6. If a system of linear equations is inconsistent, then this
signifies geometrically that there is not a single point belonging at
once to all hyperplanes given by the equations of the system.

7. It is quite obvious that when one passes to new affine
coordinates the form of the equations (8) changes. Besides, a given
plane $P_m$ in a given system of affine coordinates may be specified
by distinct systems of equations. This is clear because there is an
infinity of other equivalent systems for the system (8). Thus, for
example, we can take any nonsingular square matrix $H = \|h_{ij}\|$,
$i, j = 1, 2, \dots, r$, and write down the corollaries to the system (8):

$$\sum_{\alpha=1}^{r} h_{i\alpha}\sum_{j=1}^{n} a_{\alpha j}x_j = \sum_{\alpha=1}^{r} h_{i\alpha}b_\alpha, \qquad i = 1, \dots, r \tag{9}$$

Introducing the notation

$$a'_{ij} = \sum_{\alpha=1}^{r} h_{i\alpha}a_{\alpha j}, \qquad b'_i = \sum_{\alpha=1}^{r} h_{i\alpha}b_\alpha$$

we can write equations (9) more simply:

$$\sum_{j=1}^{n} a'_{ij}x_j = b'_i, \qquad i = 1, \dots, r \tag{10}$$

The system of equations (10) not only follows from the system (8),
but is equivalent to it (the proof of this statement is similar to
the proof of an analogous assertion in Subsection 8, Section 5).

The possibility of passing from system (8) to other systems of
the form (10) signifies geometrically that $P_m$ may be defined as
the intersection of distinct sets of $n - m$ independent hyperplanes.
The independence of hyperplanes is to be understood in the sense
that the rank of the consistent system (10) of equations of these
hyperplanes is of maximum value, that is, it is equal to the number
of equations ($r = n - m$).

§ 7. Mutual positions of planes 

1. Intersecting planes. Throughout this section, the dimensions
of planes and subspaces will be indicated by subscripts. Let two
planes $P_k$ and $P_l$ in the affine space $\mathfrak{A}_n$ have a common point $A$.
We take this point as the origin of the affine system of coordinates.
When a running point $M$ ranges over the plane $P_k$ (or $P_l$), a
vector $\overrightarrow{AM}$ ranges over the subspace $L_k$ (or $L_l$). Therefore, the
question of the mutual positions of two intersecting planes is naturally
connected with a consideration of the subspaces $L_k$ and $L_l$ in the
vector space $L_n$.

Using the properties of subspaces (Chapter I, Sections 12 to 14),
we can readily establish the following facts.

(1) If planes $P_k$ and $P_l$ intersect, then their intersection is a
certain plane $P_m$ (in Fig. 10 we have $k = l = 2$, $m = 1$).

Remark 1. It may happen that $P_m$ consists of one point
($m = 0$). This is evident from the example of two intersecting
straight lines or a straight line and a plane (Fig. 11). In the
general case, two planes can intersect in a single point provided the
sum of the dimensions of the planes does not exceed the dimension
of the space. An instance is given by two-dimensional planes in a
four-dimensional space.

Remark 2. We do not preclude another extreme case where one
of the two planes lies entirely in the other. For instance, if
$P_k \subset P_l$, $k < l$, then $P_m = P_k$ (in Fig. 12, $k = m = 1$, $l = 2$).

Fig. 10        Fig. 11


(2) If the planes $P_k$ and $P_l$ intersect along the plane $P_m$, then
there exists a unique plane $P_r$ of dimension $r = k + l - m$ which
contains $P_k$ and $P_l$; the two planes $P_k$ and $P_l$ cannot
simultaneously lie in any other plane of smaller dimension. The direction
subspace $L_r$ of the plane $P_r$ is the sum of the direction subspaces
$L_k$ and $L_l$. This sum is a direct sum if and only if $P_k$ and $P_l$
intersect in a single point ($m = 0$, see Fig. 13). In the special case
$k + l - m = n$, the role of plane $P_r$ is played by the entire space
$\mathfrak{A}_n$ (for $r = n = 3$, see Fig. 10).

Fig. 12        Fig. 13

(3) If the intersecting planes $P_k$ and $P_l$ lie in some plane $P_r$,
then the dimension of their intersection $m \geq k + l - r$. In
particular

$$m \geq k + l - n \tag{1}$$

for any two intersecting planes in $\mathfrak{A}_n$.

(4) If the planes $P_k$ and $P_l$ pass through a point $A$ in the
direction of the subspaces $L_k$ and $L_l$ respectively and if $L_k$ lies in $L_l$,
then plane $P_k$ lies in plane $P_l$. And if in that case $k = l$, then $P_k$
coincides with $P_l$ (also, $L_k$ coincides with $L_l$).

2. Parallel planes. Now let plane $P_k$ be defined by the point $A$
and the subspace $L_k$, and plane $P_l$ by the point $B$ and the
subspace $L_l$. We assume that $l \geq k$.

Definition. The plane $P_k$ is parallel to the plane $P_l$ if $L_k \subset L_l$.
We will also allow for the statement, in that case, that the plane
$P_l$ is parallel to the plane $P_k$.

Remark 1. According to this definition, the inclusion $P_k \subset P_l$
is a special case of parallelism.

Remark 2. If $P_k$ is parallel to $P_l$, and $k = l$, then $L_k$ coincides
with $L_l$.

Fig. 14

Remark 3. It is easy to see that for $n = 3$ the special cases
$k = l = 1$, $k = l = 2$, and $k = 1$, $l = 2$ agree with the notion of
parallelism of straight lines and planes in elementary geometry
(Fig. 14).

Suppose two planes $P$ and $P'$ of the same dimension are given
in an arbitrary affine system of coordinates by systems of linear
equations. Taking advantage of the definition of parallelism, we
can readily establish the following assertion.

For $P$ and $P'$ to be parallel it is necessary and sufficient that
the corresponding homogeneous systems of equations be equivalent.

In particular, two hyperplanes are parallel if and only if they
are given, in the same coordinates, by the equations

$$a_1x_1 + \dots + a_nx_n + b = 0 \tag{2}$$

and

$$a'_1x_1 + \dots + a'_nx_n + b' = 0 \tag{2'}$$

with proportional coefficients of the variables:

$$\frac{a_1}{a'_1} = \dots = \frac{a_n}{a'_n}$$

For the hyperplanes (2) and (2') to be coincident, it is necessary
and sufficient that all the coefficients of the equations be
proportional:

$$\frac{a_1}{a'_1} = \dots = \frac{a_n}{a'_n} = \frac{b}{b'}$$

Theorem 1. Let there be given in the affine space $\mathfrak{A}_n$ a plane $P_k$
and a point $B$. Then there exists a unique plane $P'_k$ of dimension
$k$ that passes through point $B$ parallel to $P_k$. If $B \in P_k$, then $P'_k$
coincides with $P_k$; if the point $B$ lies outside $P_k$, then the planes $P_k$
and $P'_k$ do not intersect.

The proof has been so fully prepared by the foregoing material
that there is no need to give it here.

3. Skew planes. 

Definition. Two planes are said to be skew if they do not
intersect and are not parallel.
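
The three mutual positions can be told apart by comparing ranks. The sketch below is only an illustration under stated assumptions: each plane is given by a point and a list of independent direction vectors, the planes intersect exactly when the vector joining the two points lies in the sum of the direction subspaces, and they are parallel (in the sense of the definition of Subsection 2) when one direction subspace contains the other. The names `rank` and `mutual_position` are assumptions.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, found by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def mutual_position(a, dirs_k, b, dirs_l):
    """Return the pair (intersect, parallel) for two planes.

    a, b           -- points of the first and second plane
    dirs_k, dirs_l -- independent direction vectors of each plane
    The planes are skew exactly when both flags are False.
    """
    k, l = len(dirs_k), len(dirs_l)
    dim_sum = rank(dirs_k + dirs_l)          # dimension of L_k + L_l
    ab = [y - x for x, y in zip(a, b)]       # the vector AB
    intersect = rank(dirs_k + dirs_l + [ab]) == dim_sum
    parallel = dim_sum == max(k, l)          # one subspace contains the other
    return intersect, parallel


# Hypothetical data: two lines in three-dimensional space.
skew_lines = mutual_position([0, 0, 0], [[1, 0, 0]], [0, 0, 1], [[0, 1, 0]])
```

For the two sample lines both flags are `False`, so the lines are skew, whereas a line and a two-dimensional plane in three-dimensional space always produce at least one `True` flag.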

In three-dimensional space $\mathfrak{A}_3$, we know that two straight lines
(one-dimensional planes, that is) can be skew, whereas a straight
line and a two-dimensional plane in $\mathfrak{A}_3$ cannot be skew. As the
dimension of a space is increased, the space becomes more "roomy"
and there is more opportunity to construct skew planes of different
dimensions besides the one-dimensional variety. Theorem 2, below,
may be regarded as a general procedure for the construction of
skew planes. Suppose, in the affine space $\mathfrak{A}_n$, we have a plane $P_l$
($l < n$). Let us take an arbitrary plane $P_k$ so that $P_k$ and $P_l$ are
not parallel and intersect; the plane along which they intersect is
denoted by $P_m$. Let $P_r$ be the plane of smallest dimension containing
$P_k$ and $P_l$. We know that $r = k + l - m$.

Theorem 2. If $k + l - m < n$, then any $k$-dimensional plane
parallel to $P_k$ and not lying in $P_r$ is skew to $P_l$.

Corollary. If the integers $k$, $l$, $m$, $n$ satisfy the inequalities

$$0 \leq m < k, \qquad 0 \leq m < l, \qquad k + l - m < n$$

then there exist in $\mathfrak{A}_n$ skew planes $P_k$ and $P_l$ with direction
subspaces $L_k$ and $L_l$ whose intersection $L_m = L_k \cap L_l$ has
dimension $m$.

Proof. Since $r = k + l - m < n$, plane $P_r$ does not exhaust the
whole of space $\mathfrak{A}_n$. This enables us (with a great deal of
arbitrariness) to take a point $C$ outside of $P_r$. Denote by $P'_k$ a plane of
dimension $k$ passing through $C$ parallel to $P_k$. It is clear that $P'_k$
is not contained in $P_r$ and that by selecting $C$ in different ways we
can obtain any $k$-dimensional plane satisfying the hypothesis of
the theorem. (See Fig. 15 in which $k = l = 2$, $r = 3$, $n = 4$, and
the three-dimensional planes are depicted in the form of
parallelepipeds.) We will prove that the planes $P_l$ and $P'_k$ are skew planes.

Note that plane $P'_k$ is not parallel to $P_l$, otherwise either $L_k \subset L_l$
or $L_l \subset L_k$, which is contrary to the condition stipulating the
positions of the planes $P_k$ and $P_l$.

We now prove that $P'_k$ and $P_l$ do not intersect. Draw through
point $C$ an auxiliary $r$-dimensional plane $P'_r$ parallel to $P_r$. Then
$P'_k \subset P'_r$ and therefore $P'_k$ cannot intersect $P_l$, for then the point
of their intersection would belong to both of the parallel planes $P_r$
and $P'_r$, which do not intersect by Theorem 1 because $C$ lies
outside $P_r$. Hence, $P'_k$ is skew to $P_l$. Theorem 2 is proved.

Suppose in an $n$-dimensional affine space $\mathfrak{A}_n$ we have skew
planes $P_k$ and $P_l$ with direction subspaces $L_k$ and $L_l$, and
$L_k \cap L_l = L_m$, $k + l - m < n$.



Fig. 15 

Theorem 3. There exists a unique plane $P_{r+1}$ of dimension
$r + 1 = (k + l - m) + 1$ containing the planes $P_k$ and $P_l$.

Proof. Choose an arbitrary point $A \in P_k$ and fix an arbitrary
point $B$ in $P_l$. Denote the linear hull of the vector $\overrightarrow{AB}$ by $L(AB)$
(Fig. 16). Suppose there is a plane $P$ containing $P_k$ and $P_l$. Let $L$
be its direction subspace. Clearly, $L$ must contain $L_k$, $L_l$ and $L(AB)$
and, hence, also the sum of these subspaces. Denote this sum by
$L_{r+1}$:

$$L_{r+1} = L_k + L_l + L(AB) \subset L$$

Conversely, if $L$ is any subspace including $L_{r+1}$, then the plane $P$
that passes through point $A$ in the direction of $L$ will contain $P_k$
and $P_l$. Indeed, since $A \in P$ and $L_k \subset L$, it follows that $P_k \subset P$.
Since $A \in P$ and $\overrightarrow{AB} \in L$, it follows that $B \in P$; since $B \in P$ and
$L_l \subset L$, then $P_l \subset P$.

We thus obtain, from among all planes $P$, the desired plane
$P_{r+1}$ of minimal dimension $r + 1$ in the unique case where $L_{r+1}$ is
taken for $L$. Let us compute $r + 1$. To do this, consider
$L' = L_k + L_l$ and denote the dimension of $L'$ by $p$. By Theorem 3,
Section 14, Chapter I, we have $p = k + l - m$. Below we will
show that $L_{r+1} = L' + L(AB)$ is a direct sum; and so the dimension
of $L_{r+1}$ is equal to $p + 1$, that is to say,
$r + 1 = (k + l - m) + 1$.

It is thus necessary to establish that $L_{r+1} = L' \oplus L(AB)$. To do
so, it suffices to show that the vector $\overrightarrow{AB}$ does not belong to the
subspace $L'$. Assume the contrary. Let $\overrightarrow{AB} \in L'$. Then by the
definition of a sum of subspaces there exist vectors $x$, $y$ such that

$$x \in L_k, \qquad y \in L_l, \qquad \overrightarrow{AB} = x + y \tag{3}$$

By the first axiom of an affine space there will be a point $C$ such
that $\overrightarrow{AC} = x$ and $C \in P_k$. By the second axiom of an affine space,

$$\overrightarrow{AB} = \overrightarrow{AC} + \overrightarrow{CB} \tag{4}$$

Taking into account (3) and (4), we find that

$$\overrightarrow{CB} = y \in L_l \tag{5}$$

so that $C \in P_l$. It turns out that the planes $P_k$ and $P_l$ have a
common point $C$, but this is impossible since $P_k$ and $P_l$ are skew
planes. The proof of Theorem 3 is complete.



Fig. 16

Remark. Fig. 16 is only a partial illustration of Theorem 3. For
example, if the dimensions of $P_k$ and $P_l$ exceed $m$ and are distinct,
$m \ge 1$, $P_{r+1} \ne P \ne \mathfrak{A}_n$, then it is easy to compute that $n \ge 7$. It
is impossible to give a complete drawing of such a situation. In
the sequel we will frequently make use of drawings depicting
figures in low-dimensional spaces ($n = 2$, $3$, sometimes $n = 4$) in
order to illustrate definitions and reasoning that refer to arbitrary
$n$-dimensional spaces.

4. The foregoing shows that the planes $P_k$ and $P_l$ that are dealt
with in Theorem 3 are not contained in any plane of smaller
dimension than $r + 1$.

And so the following theorem is valid.

Theorem 4. If the skew planes $P_k$ and $P_l$ lie in the plane $P_s$,
then

$$s \ge (k + l - m) + 1 \tag{6}$$

(Here, as above, $m$ is the dimension of the intersection $L_k \cap L_l$.)

Corollary. If in $\mathfrak{A}_n$ we have skew planes $P_k$ and $P_l$ of positive
dimensions, then

$$k \le n - 2, \qquad l \le n - 2 \tag{7}$$

The inequalities (7) follow from relation (6) for $s = n$ since for
skew planes we have $k - m \ge 1$, $l - m \ge 1$.

Special case. A hyperplane cannot be skew to any other plane
of positive dimension.

5. Retaining the notation of the preceding subsection, let us
state a sufficient condition for the intersection of two planes.

Theorem 5. If in $\mathfrak{A}_n$ are given planes $P_k$ and $P_l$ such that

$$k + l - m \ge n \tag{8}$$

where $m$ is the dimension of the intersection $L_m$ of the direction
subspaces $L_k$ and $L_l$, then $P_k$ and $P_l$ intersect.

Proof. Excluding the trivial case where one of the given planes
coincides with the entire space, we have

$$k < n, \qquad l < n \tag{9}$$

Only three possibilities are permissible for the positions of the
two given planes:

$P_k$ is parallel to $P_l$, or

$P_k$ and $P_l$ are skew planes, or

$P_k$ and $P_l$ intersect.

If $P_k$ is parallel to $P_l$, then for the dimension $m$ of the
intersection of the corresponding subspaces $L_k$ and $L_l$ we have

$$m = \min(k, l) \tag{10}$$

so that $k + l - m = \max(k, l) < n$ by (9), which contradicts
inequality (8). If $P_k$ and $P_l$ are skew to each other, inequality (6)
holds for $s = n$, which is again a contradiction relative to (8). We are
thus left with the third possibility: $P_k$ and $P_l$ intersect. Theorem 5 is
proved.

Remark. It is easy to demonstrate that under the hypothesis of
Theorem 5 the equation $k + l - m = n$ actually holds true.
However, in estimations it is easier to verify an inequality than an
equation, and so we state the sufficient condition for intersection
of planes in the form of an inequality (8).


6. We now turn to an algebraic interpretation of the theorem on
the intersection of planes.

Suppose we have two nonhomogeneous and separately consistent
systems of linear equations whose ranks are equal to $r_1$ and $r_2$. We
combine these two systems; that is, we regard all the equations
jointly. For this combined system we construct a corresponding
homogeneous system and denote its rank by $r_0$.

If $r_0 \ge r_1 + r_2$, then the combined nonhomogeneous system of
equations is consistent.

Indeed, if the given systems of equations determine the planes $P_k$
and $P_l$, then the homogeneous system corresponding to the union
of the given systems determines $L_m = L_k \cap L_l$. We accordingly
have $k = n - r_1$, $l = n - r_2$, $m = n - r_0$. Thus,
$k + l - m = n + r_0 - (r_1 + r_2) \ge n$ and, hence, the planes $P_k$ and $P_l$
intersect, which signifies that the combined system is consistent.

By way of an exercise, we leave it to the reader to prove in
purely algebraic fashion the assertion stated in this subsection,
relying not on Theorem 5 but on the Kronecker-Capelli theorem,
and to verify at the same time that the inequality $r_1 + r_2 \le r_0$
actually implies the equation $r_1 + r_2 = r_0$.
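
The rank criterion of this subsection can be tried out numerically. The following is a minimal sketch, not part of the original text: the helper names `rank` and `is_consistent` and the two small systems are hypothetical, and consistency is checked by the Kronecker-Capelli test (the rank of the coefficient matrix equals the rank of the augmented matrix).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_consistent(coeffs, rhs):
    """Kronecker-Capelli test: rank of the matrix equals rank of the augmented matrix."""
    augmented = [row + [b] for row, b in zip(coeffs, rhs)]
    return rank(coeffs) == rank(augmented)

# Hypothetical example with n = 3:
# system 1: x1 = 1 (a two-dimensional plane, r1 = 1);
# system 2: x2 = 2, x3 = 3 (a straight line, r2 = 2).
A1, b1 = [[1, 0, 0]], [1]
A2, b2 = [[0, 1, 0], [0, 0, 1]], [2, 3]

r1, r2 = rank(A1), rank(A2)
r0 = rank(A1 + A2)                       # rank of the combined homogeneous system
combined_ok = is_consistent(A1 + A2, b1 + b2)

# A case where the hypothesis fails: x1 = 1 and x1 = 2 (parallel hyperplanes).
B1, c1 = [[1, 0, 0]], [1]
B2, c2 = [[1, 0, 0]], [2]
parallel_r0 = rank(B1 + B2)
parallel_ok = is_consistent(B1 + B2, c1 + c2)
```

In the first example the inequality $r_0 \ge r_1 + r_2$ holds and the combined system is consistent; in the second it fails and the combined system has no solution, which shows only that the condition cannot simply be dropped.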

§ 8. Systems of linear inequalities and convex polyhedrons 

1. In this section we consider a real $n$-dimensional affine space
$\mathfrak{A}_n$, assuming as given an affine coordinate system.

2. Suppose a straight line through a point $X_0$ having
coordinates $(x_1^0, \ldots, x_n^0)$ is drawn in the direction of a vector $l$,
whose coordinates we denote by $\{l_1, \ldots, l_n\}$. By Subsection 6,
Section 3, this straight line may be specified by the parametric
equations

$$x_i = x_i^0 + \tau l_i, \qquad -\infty < \tau < +\infty \tag{1}$$

Let certain points $A$ and $B$ be chosen on line (1). The
corresponding values of the parameter $\tau$ will be denoted by $\tau_1$ and $\tau_2$.
Suppose $\tau_1 < \tau_2$.

Definition. The set of points of the line that satisfy the
inequalities

$$\tau_1 \le \tau \le \tau_2$$

is called a line segment $AB$.

3. If the point $A$ has coordinates $(a_1, \ldots, a_n)$ and the point $B$
has coordinates $(b_1, \ldots, b_n)$, then for the direction vector of the
line we can take the vector $l = \overrightarrow{AB}$. Then $l_i = b_i - a_i$ and for the
running point of the line we have

$$x_i = a_i + (b_i - a_i)\tau = (1 - \tau) a_i + \tau b_i$$

We have $\tau = 0$ at $A$ and $\tau = 1$ at $B$ so that the line segment $AB$
is now given by the inequalities $0 \le \tau \le 1$. Set $1 - \tau = \alpha$, $\tau = \beta$.
Then for the points of the line segment $AB$, and only for them, we
have

$$x_i = \alpha a_i + \beta b_i, \quad i = 1, \ldots, n; \qquad \alpha \ge 0, \quad \beta \ge 0, \quad \alpha + \beta = 1 \tag{2}$$

The point at which $\alpha = \beta = \tfrac{1}{2}$ is called the midpoint of the line
segment $AB$.
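
As an illustration of the parametrization of a line segment, here is a small sketch under assumed data: the function names and the coordinates of the points A and B are hypothetical, chosen only to show how the formula with weights $1 - \tau$ and $\tau$ is evaluated.

```python
from fractions import Fraction

def segment_point(a, b, tau):
    """Point of the line AB with parameter tau: x_i = (1 - tau) * a_i + tau * b_i."""
    return [(1 - tau) * ai + tau * bi for ai, bi in zip(a, b)]

def on_segment(tau):
    """A point of the line belongs to the segment AB exactly when 0 <= tau <= 1."""
    return 0 <= tau <= 1

A = [Fraction(0), Fraction(2), Fraction(4)]   # hypothetical coordinates of A
B = [Fraction(2), Fraction(6), Fraction(0)]   # hypothetical coordinates of B

midpoint = segment_point(A, B, Fraction(1, 2))  # alpha = beta = 1/2
```

Setting the parameter to 0 or 1 returns the endpoints A and B themselves, in agreement with the text.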

4. Definition. A set of points of real affine space is said to be 
convex if together with every two of its points A, B it also contains 
the line segment AB. 

The most elementary instances of convex sets are a line segment,
a plane of arbitrary dimension, the entire space $\mathfrak{A}_n$.

The set consisting of one point and the empty set are also
regarded as convex sets.

From the definition it follows immediately that the intersection
of any collection of convex sets is itself a convex set. Indeed, if
points $A$, $B$ belong to the intersection of some collection of convex
sets, then the line segment $AB$ belongs to each of these sets, and
hence to their intersection.

5. Given in the space $\mathfrak{A}_n$ an arbitrary hyperplane

$$A_1 x_1 + \ldots + A_n x_n + C = 0 \tag{3}$$

The hyperplane (3) divides the space into two parts called open
half-spaces. Their points are described by the inequalities

$$\sum A_i x_i + C < 0 \quad \text{and} \quad \sum A_i x_i + C > 0 \tag{4}$$

respectively. By adjoining the hyperplane (3) to an open
half-space we get what is called a closed half-space. One of them
consists of points whose coordinates satisfy the inequality

$$\sum A_i x_i + C \le 0 \tag{5}$$

the other, of points whose coordinates satisfy the inequality

$$\sum A_i x_i + C \ge 0 \tag{6}$$

6. It is an essential fact that the space at hand is a real space.
In the complex case, no hyperplane divides the space, just as no
straight line can divide three-dimensional real space. This means
that if the points $A$ and $B$ in complex space do not belong to a
hyperplane, then they can be joined by a line that does not
intersect this hyperplane. Contrariwise, if in real space the points $A$
and $B$ belong to two different open half-spaces (4), then any curve
joining $A$ and $B$ intersects the hyperplane (3). We omit the proof.

7. Theorem 1. Every half-space is a convex set. 

We carry out the proof for the half-space (5). Let the points $A$
and $B$ belong to this half-space. Then

$$\sum A_i a_i + C \le 0, \qquad \sum A_i b_i + C \le 0$$

and for an arbitrary point $X$ of the line segment $AB$, taking into
account (2), we get

$$\sum A_i x_i + C = \sum A_i (\alpha a_i + \beta b_i) + C(\alpha + \beta) = \alpha \left(\sum A_i a_i + C\right) + \beta \left(\sum A_i b_i + C\right) \le 0$$

Thus, the point $X$ belongs to the half-space (5). But $X$ was chosen
arbitrarily on $AB$ and so the entire line segment $AB$ belongs to
the half-space (5), which is what we set out to prove.

8. Definition. The intersection (if it is not empty) of a finite
number of half-spaces is a convex polyhedron.

We confine ourselves to polyhedrons formed by the intersection
of closed half-spaces.

Pictorially, a convex polyhedron is a piece of space cut out by
several hyperplanes (for $n = 3$ see Fig. 17). This piece may extend
to infinity (Fig. 18). It may also occur that the polyhedron lies
entirely in some $k$-dimensional plane, $k < n$ (see Fig. 19 for
$n = 3$, $k = 2$).

If we have $m$ half-spaces given by the inequalities

$$\sum_{j=1}^{n} A_{ij} x_j + C_i \le 0, \qquad i = 1, \ldots, m \tag{7}$$

then the polyhedron (the intersection of the half-spaces (7))
includes those and only those points whose coordinates satisfy the
system of inequalities

$$\begin{aligned} A_{11} x_1 + \ldots + A_{1n} x_n + C_1 &\le 0, \\ \ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots & \\ A_{m1} x_1 + \ldots + A_{mn} x_n + C_m &\le 0 \end{aligned} \tag{8}$$

On the other hand, if a system of type (8) is consistent, then it
determines a polyhedron formed by the intersection of the
half-spaces (7).



Remark. It is clear that an inequality of type (6) can always be
replaced by an inequality of type (7) by multiplying all its
coefficients by $(-1)$.
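
A system of inequalities of type (8) translates directly into a membership test for the polyhedron. The sketch below is illustrative only: the function names and the triangle used as data are hypothetical. It also shows the sign change from the Remark, which turns an inequality with the sign $\ge$ into one with the sign $\le$.

```python
def in_polyhedron(A, C, x):
    """True if sum_j A[i][j] * x[j] + C[i] <= 0 for every i, as in system (8)."""
    return all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + c_i <= 0
        for row, c_i in zip(A, C)
    )

def flip(row, c):
    """Rewrite 'row . x + c >= 0' as the equivalent '(-row) . x + (-c) <= 0'."""
    return [-a for a in row], -c

# Hypothetical triangle in the plane: x1 >= 0, x2 >= 0, x1 + x2 - 1 <= 0.
r1, c1 = flip([1, 0], 0)      # x1 >= 0  becomes  -x1 <= 0
r2, c2 = flip([0, 1], 0)      # x2 >= 0  becomes  -x2 <= 0
A = [r1, r2, [1, 1]]
C = [c1, c2, -1]

inside = in_polyhedron(A, C, [0.25, 0.25])
outside = in_polyhedron(A, C, [1, 1])
```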


9. A polyhedron is called an $n$-dimensional parallelepiped if in
some affine coordinate system it is specified by inequalities of the
form

$$\xi_i \le x_i \le \eta_i, \qquad i = 1, \ldots, n \tag{9}$$

where $\xi_i$, $\eta_i$ are scalars. In particular, we say that the
parallelepiped is constructed on the independent vectors $e_1, \ldots, e_n$ applied
to the point $O$ if it is specified by the inequalities

$$0 \le x_i \le 1, \qquad i = 1, \ldots, n \tag{10}$$

in coordinates with origin $O$ and basis $e_1, \ldots, e_n$.

The inequalities (9) can always be reduced to the form (10) by
means of a transformation of the affine coordinates.

When $n = 1$, an $n$-dimensional parallelepiped is a line segment,
when $n = 2$, it is a parallelogram.

That portion of the parallelepiped (10) located in one of the
hyperplanes $x_i = 0$ or $x_i = 1$ is itself an $(n - 1)$-dimensional
parallelepiped and is called an $(n - 1)$-dimensional face of the
parallelepiped (10). We can also consider the faces of these
$(n - 1)$-dimensional parallelepipeds, the faces of their faces, and so forth.
We thus obtain a collection of $k$-dimensional parallelepipeds of
different dimensions $k$, $n - 1 \ge k \ge 1$. They are all called
$k$-dimensional faces of the original parallelepiped (10). The
one-dimensional faces are called edges and their extremities are called
vertices of the parallelepiped. It can be shown that the vertices of
parallelepiped (10) are points, and only those points, each of the
coordinates of which is either zero or unity.
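
The description of the vertices lends itself to a direct enumeration. The sketch below lists the vertices of the parallelepiped (10) and counts its faces. The function names are hypothetical, and the counting rule $\binom{n}{k} 2^{n-k}$ is the standard count for a cube (choose the $k$ coordinates left free, fix each of the others at 0 or 1); it is not stated in the text and is included here as an assumption for illustration.

```python
from itertools import product
from math import comb

def vertices(n):
    """All points whose coordinates are each 0 or 1: the vertices of parallelepiped (10)."""
    return list(product((0, 1), repeat=n))

def count_faces(n, k):
    """Number of k-dimensional faces: choose k free coordinates, fix the other n - k at 0 or 1."""
    return comb(n, k) * 2 ** (n - k)

cube_vertices = vertices(3)
cube_edges = count_faces(3, 1)
cube_faces = count_faces(3, 2)
```

For the ordinary three-dimensional parallelepiped this gives the familiar 8 vertices, 12 edges and 6 two-dimensional faces.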

Example. In three-dimensional Euclidean space with a specified
rectangular Cartesian coordinate system $(x, y, z)$, let us consider
the rectangular parallelepipeds whose edges are parallel to the
coordinate axes. Let $(x_0, y_0, z_0)$ be the coordinates of the centre
of the parallelepiped, and $a$, $b$, $c$, the lengths of the edges parallel
to the axes $x$, $y$, $z$ respectively. Denote by $\mathcal{A}$ the set of
parallelepipeds of this kind whose centres lie in the cube $|x| \le \xi$, $|y| \le \xi$,
$|z| \le \xi$, and the lengths of the edges do not exceed $\eta$. To each
parallelepiped of the set $\mathcal{A}$ there can be associated a point of a
six-dimensional affine space $\mathfrak{A}_6$ with coordinates $(x_0, y_0, z_0, a, b, c)$.
Then the set $\mathcal{A}$ itself can be regarded as a six-dimensional
parallelepiped:

$$-\xi \le x_0 \le \xi, \quad -\xi \le y_0 \le \xi, \quad -\xi \le z_0 \le \xi,$$
$$0 \le a \le \eta, \quad 0 \le b \le \eta, \quad 0 \le c \le \eta$$

Note that geometric figures of one space are often conveniently
regarded as points of another space.


10. Definition. A set of points in the affine space $\mathfrak{A}_n$ is said to
be bounded if the coordinates of all the points of the set satisfy
the inequality $|x_i| \le M$ ($M$ a number greater than zero).

It is easy to verify, through the use of the formulas of Section 2,
that this definition does not depend on the choice of the affine
coordinate system. A set is bounded if and only if it is contained in
a certain parallelepiped.


11. Definition. The convex hull of a set $\mathcal{A}$ of points in the affine
space $\mathfrak{A}$ is defined as that convex set $\overline{\mathcal{A}} \subset \mathfrak{A}$ which contains $\mathcal{A}$
and is contained in any convex set containing $\mathcal{A}$.

In other words, the convex hull $\overline{\mathcal{A}}$ is the intersection of all
possible convex sets containing the given set $\mathcal{A}$. We also say that $\overline{\mathcal{A}}$
is the smallest convex set containing $\mathcal{A}$ (Fig. 20).

Example. The convex hull of two points $A$, $B$ is the line
segment $AB$.

It can be proved that the convex hull of any finite number of
points is a bounded convex polyhedron and that any bounded
convex polyhedron of type (8) is a convex hull of some finite system
of points called its vertices.

12. We show a geometric construction that is frequently found
to be useful in dealing with convex hulls.

Given a convex set $\mathcal{A}$ and a point $M$. Construct all possible line
segments of the form $MX$, $X \in \mathcal{A}$; denote the set of points of all
such line segments by $\mathcal{B}$ (Fig. 21). Then the following theorem
holds.


Fig. 21        Fig. 22


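Theorem 2. The set $\mathcal{B}$ is a convex hull of the union $\mathcal{A} \cup M$.

Proof. If $M \in \mathcal{A}$, then $\mathcal{B} = \mathcal{A}$ and the assertion of the theorem
is obvious. Suppose $M$ does not belong to the set $\mathcal{A}$. Any convex
set containing $\mathcal{A} \cup M$ must contain all of $\mathcal{B}$. Therefore it suffices to
verify the convexity of $\mathcal{B}$. Let the points $A, B \in \mathcal{B}$. Then $A$ lies on
some line segment $MX$ and $B$ lies on some line segment $MY$,
where $X, Y \in \mathcal{A}$ (Fig. 22). We have to establish that the line
segment $AB$ lies entirely in the set $\mathcal{B}$. Let $C$ be an arbitrary point
of $AB$. Then (if we exclude the trivial cases where one of the
points $A$, $B$ coincides with one of the points $M$, $X$, $Y$) we have

$$\overrightarrow{MA} = \lambda \overrightarrow{MX}, \qquad 0 < \lambda < 1;$$
$$\overrightarrow{MB} = \mu \overrightarrow{MY}, \qquad 0 < \mu < 1;$$
$$\overrightarrow{MC} = \alpha \overrightarrow{MA} + \beta \overrightarrow{MB}, \qquad \alpha \ge 0, \quad \beta \ge 0, \quad \alpha + \beta = 1$$

There will be a point $Z$ on $XY$ such that

$$\overrightarrow{MZ} = \frac{\alpha\lambda}{\nu} \overrightarrow{MX} + \frac{\beta\mu}{\nu} \overrightarrow{MY}$$

where $\nu = \alpha\lambda + \beta\mu$, $0 < \nu < 1$. The point $Z$ is contained in the
set $\mathcal{A}$ since the latter is convex. It is readily seen that

$$\overrightarrow{MC} = \nu \overrightarrow{MZ}$$

which means that the point $C$ belongs to the line segment
$MZ \subset \mathcal{B}$, and this concludes the proof of Theorem 2.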

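13. We note some elementary properties of a convex hull.

(1) A set coincides with its convex hull if and only if it is
convex.

(2) If $\mathcal{A}_1 \subset \mathcal{A}_2$, then the convex hull of the set $\mathcal{A}_1$ is contained
in the convex hull of the set $\mathcal{A}_2$.

These two properties follow directly from the definitions of a
convex set and a convex hull.

(3) Let $\mathcal{A} = \mathcal{A}_1 \cup \mathcal{A}_2$ and let $\overline{\mathcal{A}}_1$ be the convex hull of the
set $\mathcal{A}_1$. Then the convex hull $\overline{\mathcal{A}}$ of the set $\mathcal{A}$ coincides with
the convex hull of the union $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$.

Proof. $\mathcal{A}_1 \cup \mathcal{A}_2 \subset \overline{\mathcal{A}}_1 \cup \mathcal{A}_2$. Therefore $\overline{\mathcal{A}}$ is contained in the
convex hull of the set $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$. On the other hand, $\overline{\mathcal{A}}$ is a
convex set that contains $\overline{\mathcal{A}}_1$ and $\mathcal{A}_2$. Therefore the convex hull of
the union $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$ is contained in $\overline{\mathcal{A}}$. Thus, the set $\overline{\mathcal{A}}$ and the
convex hull of the set $\overline{\mathcal{A}}_1 \cup \mathcal{A}_2$ coincide.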

14. Given, in an affine space $\mathfrak{A}$, the points $A_0, A_1, \ldots, A_p$ with
radius vectors $a_0, a_1, \ldots, a_p$ respectively. The following theorem
holds.

Theorem 3. The convex hull of the system of points $A_0, A_1, \ldots, A_p$
is given by the formula

$$x = \alpha_0 a_0 + \alpha_1 a_1 + \ldots + \alpha_p a_p \tag{11}$$

where $x$ is the radius vector of an arbitrary point in the convex
hull, and the numbers $\alpha_0, \ldots, \alpha_p$ satisfy the conditions

$$\alpha_0 + \alpha_1 + \ldots + \alpha_p = 1, \qquad \alpha_0 \ge 0, \quad \alpha_1 \ge 0, \quad \ldots, \quad \alpha_p \ge 0 \tag{12}$$

The proof is carried out by means of induction on the number of
points. Theorem 3 holds true for two points since for $p = 1$
formulas (11) and (12) specify the line segment $A_0A_1$.

Let Theorem 3 be proved for $p + 1$ points. Consider the points
$A_0, \ldots, A_p$. Denote their convex hull by $\mathcal{A}$. Add another point
$A_{p+1}$ with radius vector $a_{p+1}$ and construct the convex hull $\mathcal{B}$ of
the union $\mathcal{A} \cup A_{p+1}$. By Property 3 of Subsection 13, the set $\mathcal{B}$
coincides with the convex hull of the system of points $A_0, A_1, \ldots,
A_p, A_{p+1}$. By Theorem 2, the set $\mathcal{B}$ consists of all possible line
segments $A_{p+1}X$, where $X \in \mathcal{A}$. The radius vector $x$ of point $X$ is
given by equation (11). Denote by $y$ the radius vector of an
arbitrary point of the line segment $A_{p+1}X$. Then

$$y = \alpha x + \beta a_{p+1}, \qquad \alpha + \beta = 1, \quad \alpha \ge 0, \quad \beta \ge 0 \tag{13}$$

Put

$$\beta_i = \alpha \alpha_i, \quad i = 0, \ldots, p; \qquad \beta_{p+1} = \beta \tag{14}$$

From formulas (11) to (14), we get

$$y = \beta_0 a_0 + \ldots + \beta_p a_p + \beta_{p+1} a_{p+1},$$
$$\beta_0 + \ldots + \beta_p + \beta_{p+1} = 1, \qquad \beta_i \ge 0, \quad i = 0, \ldots, p + 1 \tag{15}$$

Thus, every point of the set $\mathcal{B}$ satisfies the relations (15).
Conversely, substituting the quantities (14) into (15), we get for $y$
an expression of type (13), where $x$ satisfies the relations (11)
and (12). This means that every point that satisfies conditions
(15) belongs to the set $\mathcal{B}$, which completes the proof of Theorem 3.

15. Definition. The convex hull of a set of points $A_0, A_1, \ldots, A_r$
lying in the general position is called an $r$-dimensional simplex
with vertices $A_0, A_1, \ldots, A_r$.

From Theorem 3 it follows that a simplex with vertices
$A_0, \ldots, A_r$ is specified by the formulas (11) and (12) for $p = r$.
Here, the numbers $\alpha_0, \ldots, \alpha_r$ are called the barycentric
coordinates of the point of the simplex having the radius vector $x$.

Particular cases:

a zero-dimensional simplex is a single point,
a one-dimensional simplex is a line segment,
a two-dimensional simplex is a triangle,
a three-dimensional simplex is a triangular pyramid.

The point of a simplex at which all barycentric coordinates are
equal ($\alpha_0 = \ldots = \alpha_r = \frac{1}{r+1}$) is called the centre of the simplex.

Let $T_r$ be a simplex with vertices $A_0, A_1, \ldots, A_r$ and let
$A_{i_0}, A_{i_1}, \ldots, A_{i_k}$ be certain of its vertices. The $k$-dimensional
simplex which is a convex hull of the vertices $A_{i_0}, A_{i_1}, \ldots, A_{i_k}$ is
called a $k$-dimensional face of the simplex $T_r$.

One-dimensional faces, that is to say, line segments joining
vertices, are called the edges of the simplex.

Two faces of dimensions $k$ and $r - (k + 1)$ are called opposite
faces of the simplex $T_r$ if they do not have any vertices in common.
As an exercise, the reader is advised to prove that a simplex is a
convex hull of a pair of opposite faces, that the opposite faces of
a simplex always lie in skew planes, and that the line segment
joining the centres of opposite faces passes through the centre of
the simplex.


16. We will prove that an $n$-dimensional simplex in
$n$-dimensional space is the intersection of $n + 1$ closed half-spaces.

Let $A_0, A_1, \ldots, A_n$ be the vertices of the simplex $T_n$. Take $A_0$ for
the coordinate origin and choose the basis as follows:

$$e_1 = \overrightarrow{A_0A_1}, \quad e_2 = \overrightarrow{A_0A_2}, \quad \ldots, \quad e_n = \overrightarrow{A_0A_n}$$

Then the relations (11) and (12) (for $p = n$) assume, in
coordinates, the form

$$x_1 = \alpha_1, \quad x_2 = \alpha_2, \quad \ldots, \quad x_n = \alpha_n,$$
$$\alpha_0 + \alpha_1 + \ldots + \alpha_n = 1,$$
$$\alpha_0 \ge 0, \quad \alpha_1 \ge 0, \quad \ldots, \quad \alpha_n \ge 0 \tag{16}$$

whence it follows that

$$x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0,$$
$$x_1 + x_2 + \ldots + x_n \le 1 \tag{17}$$

On the other hand, (17) implies (16) if we put $\alpha_i = x_i$ for
$i = 1, \ldots, n$, $\alpha_0 = 1 - (x_1 + x_2 + \ldots + x_n)$. Thus, the systems
(16) and (17) are equivalent and specify one and the same
simplex $T_n$ (for $n = 3$ see Fig. 23).

The system of inequalities (17) shows via the intersection of
which half-spaces the simplex $T_n$ is formed.
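
The passage between the coordinates of a point and its barycentric coordinates can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the coordinates are assumed to be taken in the basis formed by the vectors from the vertex $A_0$ to the other vertices, as chosen above.

```python
from fractions import Fraction

def barycentric(x):
    """Barycentric coordinates (alpha_0, ..., alpha_n) of the point with coordinates x:
    alpha_i = x_i for i >= 1 and alpha_0 = 1 - (x_1 + ... + x_n)."""
    return [1 - sum(x)] + list(x)

def in_simplex(x):
    """System (17): every x_i >= 0 and x_1 + ... + x_n <= 1."""
    return all(xi >= 0 for xi in x) and sum(x) <= 1

n = 3
centre = [Fraction(1, n + 1)] * n      # all barycentric coordinates equal 1/(n + 1)
alphas = barycentric(centre)
```

A point belongs to the simplex exactly when all of its barycentric coordinates are nonnegative, which is what `in_simplex` tests in the form (17).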

17. We have already mentioned that a polyhedron can be
pictured as a piece of space cut out by several hyperplanes. It can be
proved that if a polyhedron is bounded, then the number $m$ of the
cutting hyperplanes, that is, the number of half-spaces the
intersection of which forms the polyhedron, must exceed the
dimensionality of the space.

The simplex corresponds to the smallest possible number
$m = n + 1$.

18. Let a polyhedron be specified by a system of inequalities of
type (8), and suppose there is a function

$$z = c_0 + c_1 x_1 + \ldots + c_n x_n \tag{18}$$

where $c_0, c_1, \ldots, c_n$ are numerical coefficients, and $x_1, \ldots, x_n$ are
the coordinates of the point in $\mathfrak{A}_n$.

The problem of finding the maximum and minimum of function
(18) on the polyhedron (8) is of such great importance in
applications (economics, for example) that the investigation of this
problem and the development of numerical methods for its solution
now constitute a whole field of research called linear
programming.

Note on the other hand that the geometrical theory of convex
polyhedrons is a substantial aid to the algebraic theory of linear
inequalities.
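
To give a feeling for the extremum problem, here is a deliberately naive sketch for the simplest bounded polyhedron, the parallelepiped with all coordinates between 0 and 1. It relies on a standard fact that is not proved in this section and is taken here as an assumption: a linear function on a bounded convex polyhedron attains its maximum and minimum at vertices. The function names and coefficients are hypothetical, and brute-force enumeration of vertices is not a practical method of linear programming.

```python
from itertools import product

def z(c0, c, x):
    """Linear function (18): z = c0 + c1*x1 + ... + cn*xn."""
    return c0 + sum(ci * xi for ci, xi in zip(c, x))

def extrema_on_unit_parallelepiped(c0, c):
    """Minimum and maximum of z over the vertices (coordinates 0 or 1) of 0 <= x_i <= 1."""
    values = [z(c0, c, v) for v in product((0, 1), repeat=len(c))]
    return min(values), max(values)

# Hypothetical coefficients c0 = 1, (c1, c2, c3) = (2, -3, 5).
lo, hi = extrema_on_unit_parallelepiped(1, [2, -3, 5])
```

The maximum is reached at the vertex where exactly the coordinates with positive coefficients equal 1, and the minimum where exactly those with negative coefficients equal 1.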



Chapter IV 


LINEAR, BILINEAR 
AND QUADRATIC FORMS 


§ 1. Linear forms

1. Suppose that in a linear space $L$ is given a numerical
function of a vector argument, that is, to every vector $x$ there is
associated a number $a(x)$.

In this chapter we regard the function $a(x)$ in the generally
accepted manner, namely we consider it invariant. This means
that the value of $a(x)$ does not depend on the choice of basis in
the space $L$.

Remark. In some of the chapters later on we will have to give
up this generally accepted viewpoint and regard functions whose
numerical value is determined for a given $x \in L$ (or for given
$x, y, \ldots \in L$) by means of a basis in $L$ and may depend on the
choice of basis. Incidentally, in this case as well we can revert to
the generally accepted viewpoint by extending the concept of the
domain of definition of a function. Indeed, if by $\mathcal{E}$ we denote the
set of all bases of space $L$, then we can consider the function
$a(x, e)$, where $x \in L$, $e \in \mathcal{E}$. We obtain the ordinary (invariant)
function $a(x)$ if $a(x, e) = a(x)$ for all $e \in \mathcal{E}$.

Definition. A function $a(x)$ is said to be linear if:

(1) $a(x + y) = a(x) + a(y)$ for any vectors $x$, $y$ in $L$;

(2) $a(\alpha x) = \alpha a(x)$ for any scalar $\alpha$ and any vector $x$ in $L$.

For values of the function $a(x)$ we will take real numbers if $L$
is real and we will admit complex numbers if $L$ is complex.

2. Examples. (1) Let $x = x_1 e_1 + \ldots + x_n e_n$, where $e_1, \ldots, e_n$
is a basis in $L$. In each basis $e_1, \ldots, e_n$ put $a(x) = x_1$. Then the
properties (1) and (2) of Subsection 1 hold true for $a(x)$, but $a(x)$
does not satisfy the definition of a linear function since it depends
on the chosen basis.

(2) Let $L$ be the space of polynomials of degree not exceeding $n$.
Let every polynomial $x(t)$ in $L$ be associated with a number $a(x)$
by the formula

$$a(x) = \int_{\tau_1}^{\tau_2} x(t)\, dt \tag{1}$$

where $\tau_1 \le t \le \tau_2$ is a given interval on the number axis. It is
clear that the numerical value of $a(x)$ does not depend on the
choice of basis in $L$. Conditions (1) and (2) of Subsection 1 hold
due to the familiar properties of a definite integral. Thus, function
(1) is a linear function in the space $L$.

Remark. The linear function (1) may also be considered in the
infinite-dimensional space of continuous functions specified on an
arbitrarily chosen interval $[\tau'_1, \tau'_2]$ subject to the condition that
$\tau'_1 \le \tau_1$, $\tau_2 \le \tau'_2$, or in the space of all functions integrable on
$[\tau'_1, \tau'_2]$ (this too is an infinite-dimensional space).

3. Given in space $L$ a linear function $a(x)$. Assuming that $L$ is
$n$-dimensional, fix in it an arbitrary basis $e_1, \ldots, e_n$ and expand
the vector $x$ in terms of this basis: $x = x_1 e_1 + \ldots + x_n e_n$. Then
the linear function will be written as

$$a(x) = a(x_1 e_1 + \ldots + x_n e_n) = x_1 a(e_1) + \ldots + x_n a(e_n) \tag{2}$$

Denote by $a_i$ the value of the function $a(x)$ on the basis vector $e_i$:

$$a_1 = a(e_1), \quad \ldots, \quad a_n = a(e_n) \tag{3}$$

If the basis is fixed, then $a_i$ represent quite definite numbers.
Substituting the quantities (3) into (2), we get an expression of the
function $a(x)$ in the form of a homogeneous polynomial of first
degree in the components (coordinates) of the vector $x$:

$$a(x) = a_1 x_1 + a_2 x_2 + \ldots + a_n x_n \tag{4}$$

4. Homogeneous polynomials of degree $k$ are generally called
forms of degree $k$. For $k = 1$ we have the term linear form, for
$k = 2$, the quadratic form.

According to formula (4), every linear function $a(x)$ in
$n$-dimensional linear space is a linear form in the components of its
argument $x$.

In this connection, it is usual to call linear functions linear
forms.

5. In the space $L_n$ we pass to a new basis $e'_1, \ldots, e'_n$ by the
formula

$$e'_i = \sum_j p_{ij} e_j \tag{I}$$

(see Section 5, Chapter II). In the new basis, the linear form will
have new coefficients $a'_i$:

$$a(x) = a'_1 x'_1 + a'_2 x'_2 + \ldots + a'_n x'_n$$

Let us find the $a'_i$ using the fact that these numbers are the values
of the form $a(x)$ on the new basis vectors:

$$a'_i = a(e'_i)$$

Using expression (I) for the vectors $e'_i$ and taking advantage of
the linearity of the function $a(x)$, we find

$$a'_i = a\left(\sum_j p_{ij} e_j\right) = \sum_j p_{ij} a(e_j) = \sum_j p_{ij} a_j$$

Thus

$$a'_i = \sum_j p_{ij} a_j \tag{5}$$

We see that (5) is fully analogous to (I).

6. We will now prove that the law of transformation of the
coefficients as expressed by formula (5) ensures the invariance of the
values of the function, which in the basis $e_1, \ldots, e_n$ is given
by (4).

For this purpose let us use the formulas (III) and (4) of
Section 5, Chapter II. In the new basis we have

$$a(x) = \sum_i a'_i x'_i = \sum_i \left(\sum_j p_{ij} a_j\right) \left(\sum_k q_{ik} x_k\right) \tag{6}$$

Note that in (6) and in other similar cases the summation indices
in the brackets must be denoted by different letters to avoid
confusion when the brackets are removed. Removing brackets and
regrouping, we have

$$\sum_i a'_i x'_i = \sum_{i,j,k} a_j x_k p_{ij} q_{ik} = \sum_{j,k} \left(a_j x_k \sum_i p_{ij} q_{ik}\right) = \sum_{j,k} a_j x_k \delta_{jk} \tag{7}$$

where $\delta_{jk}$ is the Kronecker delta. If $j \ne k$, then $\delta_{jk} = 0$ and these
terms are disregarded. If $j = k$, then $\delta_{jj} = 1$ so that $a_j x_j \delta_{jj} = a_j x_j$.
Therefore

$$\sum_{j,k} a_j x_k \delta_{jk} = \sum_j a_j x_j \tag{8}$$

Comparing (6)-(8), we finally get

$$a(x) = \sum_i a'_i x'_i = \sum_j a_j x_j$$

which states that the numerical value of $a(x)$ is preserved under
a change of basis.
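
The invariance just proved can be checked on numbers. The sketch below is illustrative: the transition matrix, the coefficients and the components are hypothetical. To avoid computing the inverse matrix, it starts from the new components $x'_i$ and obtains the old ones from $x = \sum_i x'_i e'_i$ together with the expansion of each new basis vector in the old basis, which gives $x_j = \sum_i p_{ij} x'_i$.

```python
from fractions import Fraction

def new_coefficients(P, a):
    """Formula (5): a'_i = sum_j p_ij * a_j."""
    return [sum(p_ij * a_j for p_ij, a_j in zip(row, a)) for row in P]

def old_components(P, x_new):
    """From x = sum_i x'_i e'_i and e'_i = sum_j p_ij e_j: x_j = sum_i p_ij * x'_i."""
    n = len(P)
    return [sum(P[i][j] * x_new[i] for i in range(n)) for j in range(n)]

def value(a, x):
    """Formula (4): a(x) = a_1 x_1 + ... + a_n x_n."""
    return sum(ai * xi for ai, xi in zip(a, x))

# Hypothetical data: a nonsingular transition matrix P, old coefficients a,
# and the components x' of some vector in the new basis.
P = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(1)]]
a = [Fraction(3), Fraction(-1)]
x_new = [Fraction(1), Fraction(4)]

a_new = new_coefficients(P, a)      # coefficients of the form in the new basis
x_old = old_components(P, x_new)    # components of the same vector in the old basis
```

Evaluating the form with the new coefficients on the new components and with the old coefficients on the old components gives the same number.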

7. In a linear space $L$ (which may be infinite-dimensional) let
us now consider all possible linear forms, that is, numerical linear
functions of one vector argument. We will regard a sum of
functions and the product of a function by a scalar in the ordinary
(arithmetical) sense. We then have

Theorem 1. The set $L^*$ of all linear functions specified in the
space $L$ constitutes a linear space.

Proof. We first demonstrate that the sum of two arbitrary linear
functions $a(x)$, $b(x)$ is a linear function. Set

$$c(x) = a(x) + b(x)$$

Then

$$c(x + y) = a(x + y) + b(x + y) = [a(x) + a(y)] + [b(x) + b(y)] = [a(x) + b(x)] + [a(y) + b(y)] = c(x) + c(y)$$

Besides,

$$c(\alpha x) = a(\alpha x) + b(\alpha x) = \alpha a(x) + \alpha b(x) = \alpha [a(x) + b(x)] = \alpha c(x)$$

Thus, the linearity of the sum is proved.

We now show that if a linear function is multiplied by an
arbitrary scalar $\lambda$, the result is a linear function. Let $c(x) = \lambda a(x)$.
Then

$$c(x + y) = \lambda a(x + y) = \lambda a(x) + \lambda a(y) = c(x) + c(y)$$

Furthermore

$$c(\alpha x) = \lambda a(\alpha x) = \lambda \alpha a(x) = \alpha c(x)$$

We have thus demonstrated that $\lambda a(x)$ is a linear function. Now,
if $a(x), b(x) \in L^*$, then $a(x) + b(x) \in L^*$ and $\lambda a(x) \in L^*$ for
any $\lambda$.

The zero element of $L^*$ is a (linear) function $0(x)$ equal to zero
for every vector $x$.

The function $(-1) \cdot a(x)$ is the negative of $a(x)$.

It is easy to verify that all the axioms of a linear space hold
true for $L^*$, whence follows the validity of Theorem 1.

8. Definition. The linear space $L^*$ of all linear functions defined
on $L$ is called the conjugate space associated with $L$.

Remark. According to the definition of a linear function,
multiplication in a conjugate space is admissible by the same scalars
as in the original space. In other words, if $L$ is real, then $L^*$ is
real, and if $L$ is complex, then $L^*$ is complex.

9. Theorem 2. If a linear space is $n$-dimensional, then the
associated conjugate space is also $n$-dimensional.

Proof. We introduce the basis $e_1, \ldots, e_n$ in $L$ and expand in
terms of this basis an arbitrary vector $x$ in $L$:

$$x = x_1 e_1 + x_2 e_2 + \ldots + x_n e_n$$

Then an arbitrary vector $a$ from the conjugate space $L^*$, that is
to say, a linear function $a(x)$, can be written as

$$a(x) = a_1 x_1 + a_2 x_2 + \ldots + a_n x_n$$

and is uniquely defined by specification of an $n$-tuple of
coefficients $(a_1, \ldots, a_n)$. This $n$-tuple may be regarded as a vector in
the coordinate space $K_n$. When linear functions are added or
multiplied by a scalar, their coefficients are also added and multiplied
by that scalar. Therefore, in the given case, $L^*$ is isomorphic to
the coordinate space $K_n$. Theorem 2 is proved.

10. To conclude this section we consider the geometric meaning
of a linear form. To do this we will take advantage of the affine
space $\mathfrak{A}_n$ and consider vectors of $L_n$ to be the radius vectors of
points in $\mathfrak{A}_n$ laid off from a certain point $O$. We assume that the
value of the function $a(x)$ at the point $A$ is equal to its value on
the vector $x = \overrightarrow{OA}$. The function $a(x)$ will thus be defined in $\mathfrak{A}_n$.

The following assertions are valid.

(1) The set of points at which the linear function $a(x)$ assumes
a constant value constitutes a hyperplane.

(2) Every hyperplane is a locus of points at which a certain
linear function retains a constant value.

(3) Hyperplanes corresponding to distinct values of a given
linear function $a(x)$ are parallel.

(4) The hyperplane on which $a(x) = 0$ passes through the
coordinate origin.

To prove these facts it suffices to write the equation $a(x) = c$ in
terms of the components:

$$a_1 x_1 + a_2 x_2 + \ldots + a_n x_n = c$$

and take advantage of the results of Sections 6 and 7 of
Chapter III.

§ 2. Bilinear forms 

1. A numerical function $a(x, y)$ of two vector arguments $x$
and $y$ is said to be bilinear if it is linear in each of the arguments,
that is,

(1) $a(x_1 + x_2, y) = a(x_1, y) + a(x_2, y)$,
    $a(\alpha x, y) = \alpha a(x, y)$;

(2) $a(x, y_1 + y_2) = a(x, y_1) + a(x, y_2)$,
    $a(x, \alpha y) = \alpha a(x, y)$.

Here, $x$, $y$, $x_1$, $x_2$, $y_1$, $y_2$ are any vectors in the space $L$ and $\alpha$ is
an arbitrary scalar.

2. Let $L$ be an $n$-dimensional linear space and $e_1, \ldots, e_n$ a basis
in it, and let the arguments of the bilinear function be expanded
in terms of this basis:

$$x = \sum x_i e_i, \qquad y = \sum y_k e_k$$

Then

$$a(x, y) = a\left(\sum x_i e_i, \sum y_k e_k\right) = \sum x_i y_k a(e_i, e_k) \tag{1}$$

We introduce the notation

$$a_{ik} = a(e_i, e_k) \tag{2}$$

and obtain

$$a(x, y) = \sum_{i,k=1}^{n} a_{ik} x_i y_k \tag{3}$$

Formula (3) expresses the function $a(x, y)$ in terms of components
relative to the given basis.

The polynomial in the right member of (3) is called a bilinear
form. Also, the function $a(x, y)$ itself is called a bilinear form. The
numbers $a_{ik}$ are called the coefficients of the given form relative
to the basis $e_1, \ldots, e_n$. The arguments $x$ and $y$ may be regarded
as vectors of real linear space or of complex linear space.
Accordingly, we say that the form $a(x, y)$ is given in real space or
in complex space. In the latter case, complex numbers are
admissible as values of the form $a(x, y)$. Likewise, the coefficients $a_{ik}$
are also, generally, complex numbers.

3. It is easy to demonstrate that the set of all bilinear forms
specified in a linear space $L$ also forms a linear space (if we
understand addition of forms and multiplication by a scalar in the
ordinary arithmetic sense; see Section 1 where the proof is carried
out for linear forms).

4. Let us consider, in the given basis $e_1, \dots, e_n$, the one-term
bilinear forms

$$f_{ik}(x, y) = x_i y_k \tag{4}$$

From (2) and (3) we have

$$a(x, y) = \sum a_{ik} f_{ik}(x, y) \tag{5}$$

If we take $x = e_l$, $y = e_m$, for any fixed $l$ and $m$, then $f_{lm} = 1$
and all other forms of (4) will be zero. From this it follows that
the forms (4) are independent and so they form a basis in the
space of bilinear forms. Formula (5) gives the expansion of the
bilinear form (1) in terms of the basis (4).
The basis (4) consists of $n^2$ elements, and so the space of
bilinear forms has dimension $n^2$.


5. A bilinear form $a(x, y)$ is said to be symmetric if for any
$x, y \in L$

$$a(x, y) = a(y, x)$$

A bilinear form $a(x, y)$ is said to be skew-symmetric if for any
$x, y \in L$

$$a(x, y) = -a(y, x)$$

In the case of a symmetric bilinear form, the coefficients are
symmetric: $a_{ik} = a_{ki}$ (see formula (2)). For a skew-symmetric
form, $a_{ik} = -a_{ki}$ and, in particular, $a_{ii} = 0$. Both symmetric and
skew-symmetric bilinear forms form subspaces of the space of all
bilinear forms with arguments in $L$. To find the dimensions of
these subspaces, we construct bases in them.

A symmetric bilinear form may be written as

$$a(x, y) = \sum_{i<k} a_{ik} (x_i y_k + x_k y_i) + \sum_{i} a_{ii} x_i y_i \tag{6}$$

Consider the forms

$$f^{+}_{ik}(x, y) = x_i y_k + x_k y_i \quad (i \ne k), \qquad f^{+}_{ii}(x, y) = x_i y_i \tag{7}$$

The bilinear forms (7) are linearly independent and symmetric
and any symmetric bilinear form is expressible in terms of these
forms by a formula of type (6). For this reason, the forms (7)
constitute a basis in the subspace of all symmetric bilinear forms. The
number of elements in the basis (7) is equal to

$$\frac{n(n-1)}{2} + n = \frac{1}{2}\, n(n+1)$$

Such also is the dimension of the subspace of symmetric forms,
whence it follows that for any choice of $N = \frac{1}{2} n(n+1)$
independent symmetric bilinear forms $w_1(x, y), \dots, w_N(x, y)$,
an arbitrary symmetric form can be represented as

$$a(x, y) = \sum_{i=1}^{N} \lambda_i w_i(x, y)$$

where $\lambda_i$ are numerical coefficients.

For skew-symmetric bilinear forms we have

$$a(x, y) = \sum_{i<k} a_{ik} (x_i y_k - x_k y_i)$$

and for a basis we can take the forms

$$f^{-}_{ik}(x, y) = x_i y_k - x_k y_i \quad (i < k)$$

They total $N = \frac{1}{2} n(n-1)$. Hence, for any choice of $N$
independent skew-symmetric forms $w_1, \dots, w_N$ we get for an arbitrary
skew-symmetric form $a(x, y)$ the representation

$$a(x, y) = \sum_{i=1}^{N} \lambda_i w_i(x, y)$$

6. Theorem. The space of bilinear forms is a direct sum of the
subspace of symmetric and the subspace of skew-symmetric
bilinear forms.

Proof. Clearly, a bilinear form is simultaneously symmetric and
skew-symmetric only when it is zero, whence it follows from
Theorem 1, Section 14, Chapter I, that the sum of the subspaces under
consideration is a direct sum.

On the other hand, any bilinear form $a(x, y)$ may be represented
as the sum of a symmetric and a skew-symmetric form, namely

$$a(x, y) = \frac{1}{2}\,[a(x, y) + a(y, x)] + \frac{1}{2}\,[a(x, y) - a(y, x)]$$

Hence, the direct sum of these subspaces coincides with the entire 
space. The proof of the theorem is complete. 
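
The decomposition used in the proof can be checked numerically. The following is a minimal sketch, not part of the original text: it works with the coefficient matrix of a bilinear form in a fixed basis, and the function name `split_symmetric_skew` and the sample matrix are assumptions chosen only for illustration.

```python
from fractions import Fraction


def split_symmetric_skew(a):
    """Split the coefficient matrix of a bilinear form into a symmetric
    part S and a skew-symmetric part K with S + K equal to the matrix."""
    n = len(a)
    half = Fraction(1, 2)
    sym = [[half * (a[i][k] + a[k][i]) for k in range(n)] for i in range(n)]
    skew = [[half * (a[i][k] - a[k][i]) for k in range(n)] for i in range(n)]
    return sym, skew


# Hypothetical coefficients a_ik of a bilinear form in two dimensions.
A = [[1, 4], [2, 3]]
S, K = split_symmetric_skew(A)
```

Here `S` holds the coefficients of the symmetric summand and `K` those of the skew-symmetric one; their diagonal entries behave as stated in Subsection 5 (the diagonal of `K` is zero).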

7. Now let us pass to a new basis:

$$e'_i = \sum_{j} P_{ij} e_j \tag{8}$$

In the new basis, $x = \sum x'_i e'_i$ and $y = \sum y'_k e'_k$.
Because the form $a(x, y)$ is invariant, we have

$$a(x, y) = \sum a_{ik} x_i y_k = \sum a'_{ik} x'_i y'_k \tag{9}$$

where $a'_{ik}$ are new coefficients. Quite naturally, the invariance of
the form $a(x, y)$ does not signify the invariance of its coefficients
(generally, $a'_{ik} \ne a_{ik}$). Let us express the coefficients $a'_{ik}$ in terms
of the old coefficients $a_{ik}$. We take advantage of the fact that the
values of the form on the basis vectors coincide with the
coefficients of the form:

$$a'_{ik} = a(e'_i, e'_k) \tag{10}$$

In place of the new basis vectors $e'_i$, $e'_k$ we substitute into (10)
their expressions (8):

$$a(e'_i, e'_k) = a\left(\sum_{j} P_{ij} e_j, \sum_{l} P_{kl} e_l\right)$$

Now, since the form $a(x, y)$ is linear in each of the arguments,
we get

$$a(e'_i, e'_k) = \sum_{j,l} P_{ij} P_{kl}\, a(e_j, e_l)$$

Thus

$$a'_{ik} = \sum_{j,l=1}^{n} a_{jl} P_{ij} P_{kl} \tag{I}$$

Remark. The expressions (I) can be obtained in a different way,
by proceeding from (9). Since by Section 5, Chapter II,

$$x_j = \sum_{i} P_{ij} x'_i, \qquad y_l = \sum_{k} P_{kl} y'_k$$

we have, from (9),

$$\sum a'_{ik} x'_i y'_k = \sum a_{jl} x_j y_l = \sum a_{jl} P_{ij} P_{kl}\, x'_i y'_k$$

whence we again find (I). Yet it is evident not only that (I)
follows from (9), but that (9) also follows from (I). Thus, the
invariance of the form implies the law of transformation of its
coefficients by equations (I). In turn, the transformation of coefficients
by (I) guarantees the invariance of the form.

§ 3. The matrix of a bilinear form 

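1. Given an arbitrary bilinear form. Expanded, it looks like this:

$$\begin{aligned}
a(x, y) = \sum a_{ik} x_i y_k
 &= a_{11} x_1 y_1 + a_{12} x_1 y_2 + \dots + a_{1n} x_1 y_n \\
 &+ a_{21} x_2 y_1 + a_{22} x_2 y_2 + \dots + a_{2n} x_2 y_n \\
 &+ \dots \\
 &+ a_{n1} x_n y_1 + a_{n2} x_n y_2 + \dots + a_{nn} x_n y_n
\end{aligned}$$

Writing out the coefficients in the form of an array, we get a
square matrix called the matrix of the bilinear form:

$$A = \begin{Vmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{Vmatrix}$$

In a given basis, the matrix fully determines the bilinear form
since it yields all the coefficients.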

2. Suppose we pass to a new basis:

$$e'_i = \sum_{j} P_{ij} e_j$$

In the new basis, the form $a(x, y)$ has a different matrix,
$A' = \|a'_{ik}\|$. The elements of the matrix $A'$ are expressed by
the formulas (I) of Section 2. Let us transform these formulas so
as to obtain a matrix relation that expresses (I) in its entirety.
Write (I) as

$$a'_{ik} = \sum_{j,l} P_{ij}\, a_{jl}\, P_{kl} \tag{I}$$

and introduce the quantities

$$c_{jk} = \sum_{l} a_{jl} P_{kl} \tag{$\alpha$}$$

By (I) and ($\alpha$) we have

$$a'_{ik} = \sum_{j} P_{ij} c_{jk} \tag{$\beta$}$$

Now form the matrix $C = \|c_{jk}\|$ with the usual convention that
the first index stands for the row number and the second for the
column number.

Relations ($\alpha$) and ($\beta$) will now be considered from the
standpoint of the multiplication of matrices. When the index $l$ is varied,
$a_{jl}$ runs over a row of matrix $A$, and $P_{kl}$ runs over a row of
matrix $P$. Thus, in ($\alpha$) we have the product of a row into a row.
To convert a row into a column it suffices to take the transpose
of the matrix. Accordingly, in relation ($\alpha$) we will regard the
second factor under the summation sign as an element of the matrix
$P^*$ (the transpose of $P$). We then get the product of a row of $A$
by a column of $P^*$. In other words, ($\alpha$) is equivalent to the matrix
equation

$$C = A P^* \tag{$\alpha_1$}$$

Now consider formula ($\beta$). It is immediately apparent that on
the right we have a product of a row by a column, and so from ($\beta$)
it follows that

$$A' = P C \tag{$\beta_1$}$$

From ($\alpha_1$) and ($\beta_1$) we get the desired formula:

$$A' = P A P^* \tag{2}$$

Formula (2) expresses the matrix of the bilinear form in a new
basis in terms of the matrix $P$ and of the matrix of this form in
the old basis; the change from the old to the new basis is made
with the aid of $P$.
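
A short numerical check of the transformation law may help fix the index conventions. The sketch below is an illustration only: the helper names (`matmul`, `transpose`, `bilinear`, `old_components`) and the sample matrices are assumptions, and rows of `P` are taken to hold the new basis vectors expanded in the old basis, as in formula (8) of Section 2.

```python
def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][j] * y[j][k] for j in range(len(y)))
             for k in range(len(y[0]))] for i in range(len(x))]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def bilinear(a, x, y):
    """Value of the form: sum of a_ik * x_i * y_k over all i, k."""
    n = len(a)
    return sum(a[i][k] * x[i] * y[k] for i in range(n) for k in range(n))


def old_components(p, x_new):
    """Old components from new ones: x_j = sum over i of P_ij * x'_i."""
    n = len(p)
    return [sum(p[i][j] * x_new[i] for i in range(n)) for j in range(n)]


A = [[1, 2], [0, 3]]          # matrix of the form in the old basis
P = [[1, 1], [0, 2]]          # e'_1 = e_1 + e_2, e'_2 = 2 e_2
A_new = matmul(matmul(P, A), transpose(P))   # A' = P A P*

x_new, y_new = [1, 2], [3, -1]               # components in the new basis
x_old, y_old = old_components(P, x_new), old_components(P, y_new)
value_old = bilinear(A, x_old, y_old)
value_new = bilinear(A_new, x_new, y_new)
```

The two values agree, which is the invariance of the form; the matrices `A` and `A_new` differ, which is the point made about the coefficients.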

3. Conclusions from formula (2). Note that $P$ and $P^*$ are
nonsingular matrices. From this fact and by the theorem on the rank
of a matrix product (Chapter II, Section 4) we have

$$\operatorname{rank} A' = \operatorname{rank} A \tag{3}$$

Definition. The rank of a bilinear form is the rank of its matrix.
Because of (3), the rank of a bilinear form is an invariant
relative to change of basis and is thus a quantity related to the form
itself, irrespective of its coordinate representation. Somewhat later
(in Section 11) we will give a geometric interpretation of the rank
of a bilinear form.

4. Let us consider the determinant of the matrix of a bilinear
form in some basis:

$$\Delta = \det A$$

In another basis, $\Delta' = \det A'$. From (2) and from the theorem on
multiplication of determinants it follows that

$$\Delta' = \Delta \cdot (\det P)^2 \tag{4}$$

Thus, the determinant of the matrix of a bilinear form is not an
invariant but changes with a change of basis by formula (4).

5. Given a bilinear form $a(x, y) = \sum a_{ij} x_i y_j$ whose determinant
$\Delta \ne 0$, and an arbitrary linear form $b(x) = \sum b_i x_i$. We can then
choose $y$ so that $a(x, y) = b(x)$ for any $x \in L$ (with $y$ fixed). This
can be done by finding $(y_1, \dots, y_n)$ from the system

$$\sum_{j} a_{ij} y_j = b_i, \qquad i = 1, \dots, n$$

whose determinant $\Delta \ne 0$. Hence, one bilinear form $a(x, y)$
contains, as it were, all possible linear forms specified in $L$.

§ 4. Quadratic forms 

1. Given a symmetric bilinear form $a(x, y)$: $a(y, x) = a(x, y)$.
This is equivalent to its matrix being symmetric in any basis:
$A^* = A$. Indeed,

$$a_{ik} = a(e_i, e_k) = a(e_k, e_i) = a_{ki}$$

Identifying the two arguments of the form $a(x, y)$, we get
$a(x, x) = a(x, y)$ for $y = x$.

The function $a(x, x)$ is called a quadratic form and corresponds
to the given symmetric bilinear form $a(x, y)$.

The original (symmetric) bilinear form $a(x, y)$ is said to be the
polar of the quadratic form $a(x, x)$.

2. We will prove that the polar bilinear form is uniquely
determined by its quadratic form.

Suppose we have a numerical function $f(x)$ of a vector
argument, and suppose $f(x)$ is some quadratic form, i.e.,
$f(x) = a(x, x)$, and $a(x, y)$ is unknown. To find it, we consider
$f(x + y)$, where $x$, $y$ are arbitrary vectors. Taking advantage of
the properties of a bilinear form and its symmetry, we have

$$f(x + y) = a(x + y, x + y) = a(x, x) + a(x, y) + a(y, x) + a(y, y) = f(x) + 2a(x, y) + f(y)$$

whence the desired expression:

$$a(x, y) = \frac{1}{2}\,[f(x + y) - f(x) - f(y)] \tag{1}$$
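
The recovery of the polar form from the quadratic form can be tried on a concrete example. This is a sketch under stated assumptions: the matrix `A` is a hypothetical symmetric coefficient matrix, and `quadratic_form` and `polar_from_quadratic` are illustrative names, not notation from the text.

```python
from fractions import Fraction


def quadratic_form(a, x):
    """f(x) = a(x, x) for a symmetric coefficient matrix a."""
    n = len(a)
    return sum(a[i][k] * x[i] * x[k] for i in range(n) for k in range(n))


def polar_from_quadratic(f, x, y):
    """Polar form computed only from values of f: (f(x+y) - f(x) - f(y)) / 2."""
    s = [xi + yi for xi, yi in zip(x, y)]
    return Fraction(1, 2) * (f(s) - f(x) - f(y))


A = [[2, 1], [1, 3]]                      # symmetric: a_12 = a_21


def f(v):
    return quadratic_form(A, v)


x, y = [1, 2], [3, -1]
recovered = polar_from_quadratic(f, x, y)
direct = sum(A[i][k] * x[i] * y[k] for i in range(2) for k in range(2))
```

Both `recovered` and `direct` give the same number, although `polar_from_quadratic` never looks at the matrix itself.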

3. Formula (1) may be taken for the definition of a quadratic
form. Namely, we can say that a quadratically-homogeneous
function $f(x)$ (that is, a function $f$ such that $f(\alpha x) = \alpha^2 f(x)$ for any
number $\alpha$) is called a quadratic form if the right-hand side of (1)
is a bilinear function.

Note that the definition of a quadratic form does not provide
for a basis, which means it is applicable in infinite-dimensional
spaces.

4. Example. Let $L$ be a linear space of functions continuous on
the interval $[0, 1]$.

Consider the function

$$f(x) = \int_0^1 [x(t)]^2\, dt$$

the argument of which is $x = x(t) \in L$. Here it is clear that
$f(\alpha x) = \alpha^2 f(x)$. Besides, we have

$$\frac{1}{2}\,[f(x + y) - f(x) - f(y)] = \int_0^1 x(t)\, y(t)\, dt \tag{2}$$

It is easy to verify that the right member of (2) is a bilinear
form. Thus, $f(x)$ is a quadratic form in the infinite-dimensional
space $L$.

Later on (Section 10) we will see that important implications
follow; for instance, it will be possible to prove integral
inequalities with the aid of purely algebraic theorems.

5. Let us return to the $n$-dimensional case. In $n$-dimensional
space we consider a quadratic form and write its expression in
terms of the components of the arguments.

Let $a(x, y) = a(y, x)$, $x = y$. Then

$$\begin{aligned}
f(x) = a(x, x) = \sum a_{ik} x_i x_k
 &= a_{11} x_1 x_1 + a_{12} x_1 x_2 + \dots + a_{1n} x_1 x_n \\
 &+ a_{21} x_2 x_1 + a_{22} x_2 x_2 + \dots + a_{2n} x_2 x_n \\
 &+ \dots \\
 &+ a_{n1} x_n x_1 + a_{n2} x_n x_2 + \dots + a_{nn} x_n x_n
\end{aligned} \tag{3}$$

If we take into account the symmetry of the coefficients, the
terms of the sum (3), with the exception of the diagonal terms,
naturally combine into pairs. We then obtain a frequently used
notation for quadratic forms which looks like this:

$$\begin{aligned}
f(x) &= a_{11} x_1^2 + 2a_{12} x_1 x_2 + 2a_{13} x_1 x_3 + \dots + 2a_{1n} x_1 x_n \\
 &+ a_{22} x_2^2 + 2a_{23} x_2 x_3 + \dots + 2a_{2n} x_2 x_n \\
 &+ \dots + a_{nn} x_n^2
\end{aligned} \tag{4}$$

Note that in the first row of (4) we have all the terms
containing $x_1$.


6. The matrix of a quadratic form is the matrix of its polar
bilinear form:

$$A = \begin{Vmatrix}
a_{11} & \dots & a_{1n} \\
\dots & \dots & \dots \\
a_{n1} & \dots & a_{nn}
\end{Vmatrix}$$

From this definition it follows immediately that the matrix of a
quadratic form transforms by the formula

$$A' = P A P^*$$

which was proved in the preceding section.

7. By definition, the rank of a quadratic form is equal to the 
rank of its matrix: r = rank A. 

8. Quadratic forms have important geometric applications which
will be considered below in Chapters VIII and XI. For the present 
we will not relate any geometric objects to quadratic forms and 
will examine their properties from the algebraic point of view. 

9. If in a certain basis it turns out that all coefficients $a_{ik} = 0$
for $i \ne k$, then we say that the quadratic form is canonical in that
basis:

$$f(x) = a_{11} x_1^2 + a_{22} x_2^2 + \dots + a_{nn} x_n^2$$

In order to obtain the canonical form of a quadratic form, the
basis must be chosen in a special way. In an arbitrary basis, a
quadratic form is complete, that is to say, it has all terms,
generally speaking.

The reduction of a quadratic form to canonical form is an
important problem of both theoretical and applied mathematics.
Below we give two methods for reducing a quadratic form to
canonical form: the Lagrange method and the Jacobi method.


10. If a form has been reduced to the canonical form, then its
matrix becomes diagonal:

$$\begin{Vmatrix}
a_{11} & & & 0 \\
 & a_{22} & & \\
 & & \ddots & \\
0 & & & a_{nn}
\end{Vmatrix} \tag{5}$$

Since the rank of a quadratic form is an invariant, it is equal to
the number of nonzero diagonal elements of matrix (5).

If the rank $r < n$, then after an appropriate change of the
numbers of the entries the matrix (5) may be written as

$$\begin{Vmatrix}
a_{11} & & & & & \\
 & \ddots & & & 0 & \\
 & & a_{rr} & & & \\
 & & & 0 & & \\
 & 0 & & & \ddots & \\
 & & & & & 0
\end{Vmatrix}$$

11. Remark. If a quadratic form is reduced to canonical form,
then its bilinear form is at the same time reduced to diagonal
form:

$$a(x, y) = a_{11} x_1 y_1 + a_{22} x_2 y_2 + \dots + a_{nn} x_n y_n$$

§ 5. Reducing a quadratic form to canonical form 
by Lagrange’s method 

1. Given a quadratic form $f(x) = a(x, x)$. By formula (4),
Section 4, we can write $f(x)$ in any basis as

$$f(x) = a_{11} x_1^2 + 2a_{12} x_1 x_2 + \dots + 2a_{1n} x_1 x_n + g(x_2, \dots, x_n) \tag{1}$$

where $g$ is a quadratic form that does not include $x_1$.

The notation (1) makes it possible to prove by induction that a
quadratic form can be reduced to canonical form.

Theorem. Every quadratic form can be reduced to canonical
form by means of a nonsingular linear transformation.

Remark. Here it is a question of transforming variables, namely
the numerical arguments $x_1, \dots, x_n$ of the polynomial (1). But the
theorem can also be understood geometrically since any
nonsingular transformation of variables may be regarded as a
transformation of coordinates in a change of basis (see Chapter II).

2. Proof of the theorem. A quadratic form in one variable always
has the canonical form $a_{11} x_1^2$. For the hypothesis of induction we
assume that any quadratic form in $(n - 1)$ numerical arguments
can be reduced to canonical form by a nonsingular linear
transformation of $(n - 1)$ variables.

We consider an arbitrary quadratic form $f(x)$ in $n$ numerical
arguments:

$$f(x) = \sum a_{ik} x_i x_k$$

Using the induction hypothesis, we will prove that the quadratic
form can be reduced to canonical form by a nonsingular linear
transformation of $n$ variables. Two cases are possible.

First case. In the quadratic form $f(x)$ at least one of the
coefficients $a_{ii}$ of the squares of the variables is different from zero.
Without any loss of generality we can assume that $a_{11} \ne 0$. We
set up the following linear transformation with respect to the given
coefficients of the form $f(x)$:

$$\begin{aligned}
y_1 &= a_{11} x_1 + \dots + a_{1n} x_n, \\
y_2 &= x_2, \\
 &\dots \\
y_n &= x_n
\end{aligned} \tag{2}$$

Denote by $Q$ the matrix of this transformation:

$$Q = \begin{Vmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
0 & 1 & \dots & 0 \\
\dots & \dots & \dots & \dots \\
0 & 0 & \dots & 1
\end{Vmatrix}$$

The transformation (2) is nonsingular since $\det Q = a_{11} \ne 0$. Also
note that the nonsingularity of transformation (2) follows from its
one-to-oneness, which in turn is immediately apparent from
formulas (2).

Square the expression $y_1$ and divide by $a_{11} \ne 0$:

$$\frac{1}{a_{11}}\, y_1^2 = \frac{1}{a_{11}}\,(a_{11} x_1 + \dots + a_{1n} x_n)^2
 = a_{11} x_1^2 + 2a_{12} x_1 x_2 + \dots + 2a_{1n} x_1 x_n + \varphi(x_2, \dots, x_n)$$

where $\varphi$ is a quadratic form in the arguments $x_2, \dots, x_n$, that is
to say, $\varphi$ does not include $x_1$. Now let us introduce another
quadratic form $\psi$ in the same arguments $x_2, \dots, x_n$, setting

$$\psi(x_2, \dots, x_n) = g(x_2, \dots, x_n) - \varphi(x_2, \dots, x_n)$$

where $g(x_2, \dots, x_n)$ is given in the notation of $f(x)$ in (1). Then
we get

$$f(x) = \frac{1}{a_{11}}\, y_1^2 + \psi(x_2, \dots, x_n)$$

or, what is the same thing,

$$f(x) = \frac{1}{a_{11}}\, y_1^2 + \psi(y_2, \dots, y_n)$$

By the induction hypothesis, there is a nonsingular
transformation of $n - 1$ variables

$$z_k = \sum_{i=2}^{n} R_{ki}\, y_i, \qquad k = 2, \dots, n \tag{3}$$

which reduces the form $\psi$ to canonical form:

$$\psi(y_2, \dots, y_n) = b_{22} z_2^2 + \dots + b_{nn} z_n^2$$

We complete the transformation (3) so that all $n$ variables
participate. Namely, we put

$$\begin{aligned}
z_1 &= y_1, \\
z_2 &= R_{22} y_2 + \dots + R_{2n} y_n, \\
 &\dots \\
z_n &= R_{n2} y_2 + \dots + R_{nn} y_n
\end{aligned} \tag{4}$$

We transform the variables $x_1, \dots, x_n$ into the variables
$y_1, \dots, y_n$ by formulas (2), and then transform the variables
$y_1, \dots, y_n$ by formulas (4) to get a transformation of the variables
$x_1, \dots, x_n$ into the variables $z_1, \dots, z_n$, which reduces the original
quadratic form to the canonical form

$$f(x) = \frac{1}{a_{11}}\, z_1^2 + b_{22} z_2^2 + \dots + b_{nn} z_n^2$$

This transformation is nonsingular since it is the product of
nonsingular transformations (2) and (4).

Second case. All diagonal coefficients $a_{ii}$ in the quadratic form
$f(x)$ are zero. Then the foregoing does not apply. But one of the
nondiagonal coefficients is nonzero. Let it be $a_{12}$. Then the
quadratic form has the form

$$f(x) = 2a_{12} x_1 x_2 + \dots \tag{5}$$

Make the transformation

$$\begin{aligned}
x_1 &= x'_1 + x'_2, \\
x_2 &= x'_1 - x'_2, \\
x_3 &= x'_3, \\
 &\dots \\
x_n &= x'_n
\end{aligned} \tag{6}$$

The transformation is one-to-one and, hence, is nonsingular.

Substituting the quantities of (6) into the quadratic form (5),
we get

$$f(x) = 2a_{12} {x'_1}^2 - 2a_{12} {x'_2}^2 + \dots \tag{7}$$

The term $2a_{12} {x'_1}^2$ cannot vanish when collecting terms because all
terms of the quadratic form that are not written out in expression
(5) do not contain the product $x_1 x_2$ and cannot yield ${x'_1}^2$ via
transformation (6).

Furthermore, the quadratic form (7) can be reduced to
canonical form by a nonsingular transformation since we have the
first case here: the coefficient of ${x'_1}^2$ is nonzero. This finishes the
reasoning by induction and hence the proof.

3. Remark. From the proof it is evident that a quadratic form
with real coefficients can be reduced to canonical form by a
nonsingular linear transformation which also has real coefficients.
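
The two cases of the proof can be followed on small examples. The forms below are hypothetical illustrations chosen for this purpose; the symbols $\varphi$, $\psi$, $g$ follow the notation of the proof, and primes mark the new variables of the second case.

```latex
% First case: a_{11} \ne 0.
\begin{align*}
f(x) &= 2x_1^2 + 4x_1x_2 + 5x_2^2, \qquad a_{11} = 2,\ a_{12} = 2,\ g(x_2) = 5x_2^2, \\
y_1 &= a_{11}x_1 + a_{12}x_2 = 2x_1 + 2x_2, \qquad y_2 = x_2, \\
\tfrac{1}{a_{11}}\,y_1^2 &= \tfrac12\,(2x_1 + 2x_2)^2 = 2x_1^2 + 4x_1x_2 + 2x_2^2,
  \qquad \varphi(x_2) = 2x_2^2, \\
\psi(x_2) &= g(x_2) - \varphi(x_2) = 3x_2^2, \\
f(x) &= \tfrac12\,y_1^2 + 3y_2^2 .
\end{align*}

% Second case: a_{11} = a_{22} = 0, a_{12} \ne 0.
\begin{align*}
f(x) &= 2x_1x_2, \qquad x_1 = x_1' + x_2', \quad x_2 = x_1' - x_2', \\
f(x) &= 2\,(x_1' + x_2')(x_1' - x_2') = 2{x_1'}^2 - 2{x_2'}^2 .
\end{align*}
```

In the first example the transformation matrix has determinant $a_{11} = 2 \ne 0$; in the second the substitution has determinant $-2 \ne 0$, so both are nonsingular, as the theorem requires.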

§ 6. The normal form of a quadratic form 

1. Suppose a quadratic form $f(x)$ is reduced to the canonical
form

$$f(x) = \sum_{i=1}^{r} a_{ii} x_i^2 \tag{1}$$

where $a_{11}, \dots, a_{rr} \ne 0$ and $r$ is the rank of $f(x)$.

2. Suppose we are dealing with a complex space and allow for
the use of linear transformations with complex coefficients. Set

$$y_i = \sqrt{a_{ii}}\; x_i \ \text{ if } i \le r, \qquad y_i = x_i \ \text{ if } i > r \tag{2}$$

From (1) and (2) we get

$$f(x) = y_1^2 + \dots + y_r^2 \tag{3}$$

assuming that $y_1, \dots, y_r, y_{r+1}, \dots, y_n$ are the new components
(coordinates) of the vector $x$. The expression (3) is said to be the
normal form of the quadratic form $f(x)$. Noticing that the
transformation (2) is nonsingular, we draw the following conclusion.

In complex space, any quadratic form may be reduced to the
normal form (3) by means of a nonsingular linear
transformation.

3. We now confine ourselves to real spaces and real linear
transformations. Taking into account that there may be negative
coefficients among the coefficients $a_{ii}$, we set

$$y_i = \sqrt{|a_{ii}|}\; x_i \ \text{ if } i \le r, \qquad y_i = x_i \ \text{ if } i > r \tag{4}$$

If the first $k$ coefficients $a_{ii}$ are positive and the remaining $r - k$
are negative, then from (1) and (4) we obtain

$$f(x) = y_1^2 + \dots + y_k^2 - y_{k+1}^2 - \dots - y_r^2 \tag{5}$$

Expression (5) is also called the normal form of the form $f(x)$.
Thus, in real space, any quadratic form may be reduced to the
normal form (5) with the aid of nonsingular real linear
transformations.

4. In the next section we will prove that in real space the
number of positive and negative terms in (5) does not depend on
which particular (real) transformation is used to reduce the
quadratic form to normal form.

§ 7. The law of inertia of quadratic forms 

1. Suppose in real space we have a quadratic form of rank $r$:

$$f(x) = \sum a_{ik} x_i x_k$$

where $\{x_i\}$ are the components of the vector $x$ relative to a certain
basis $e_1, \dots, e_n$.

Let $\varepsilon_1, \dots, \varepsilon_n$ be a basis in which $f(x)$ has the normal form:

$$f(x) = y_1^2 + \dots + y_k^2 - y_{k+1}^2 - \dots - y_r^2 \tag{1}$$

Here, $\{y_i\}$ are the components of the vector $x$ relative to the basis
$\varepsilon_1, \dots, \varepsilon_n$.

2. The number of positive and the number of negative terms in
formula (1) go respectively by the names positive and negative
index of the form; the difference between the positive index and
the negative index is called the signature.
3. Theorem (law of inertia of quadratic forms). The positive
index and the negative index are invariants of a quadratic form,
that is to say, they are independent of any choice of basis in which
it has a normal form.

Proof. Let there be another basis $\varepsilon'_1, \dots, \varepsilon'_n$ relative to which
the form $f(x)$ has the normal form

$$f(x) = z_1^2 + \dots + z_m^2 - z_{m+1}^2 - \dots - z_r^2 \tag{2}$$

where $\{z_i\}$ are the components of $x$ relative to the basis $\varepsilon'_1, \dots, \varepsilon'_n$.
It is required to prove that

$$k = m$$

Suppose that $k \ne m$, for instance, that $k > m$. We consider the
formulas for transformation of coordinates:

$$z_i = \sum_{j} Q_{ij} y_j \tag{3}$$

Note that the matrix $Q$ of coefficients $Q_{ij}$ is nonsingular.

Substituting (3) into (2), we should get (1). Hence we have
the identity

$$z_1^2 + \dots + z_m^2 - z_{m+1}^2 - \dots - z_r^2
 = y_1^2 + \dots + y_k^2 - y_{k+1}^2 - \dots - y_r^2 \tag{4}$$

which is true for any $y_1, \dots, y_r, y_{r+1}, \dots, y_n$ assuming that
$z_1, \dots, z_n$ are expressed in terms of $y_1, \dots, y_n$ with the aid of (3).

Let us set up the following auxiliary homogeneous system of
equations:

$$\begin{aligned}
Q_{11} y_1 + \dots + Q_{1k} y_k &= 0, \\
 &\dots \\
Q_{m1} y_1 + \dots + Q_{mk} y_k &= 0
\end{aligned} \tag{5}$$

The number of unknowns in (5) is greater than the number of
equations because of the assumption that $k > m$. Therefore, the
system (5) has a nontrivial solution $y_1, \dots, y_k$. Substitute this
solution into the identity (4) with the condition that

$$y_{k+1} = \dots = y_r = y_{r+1} = \dots = y_n = 0 \tag{6}$$

Then, taking into account (3), (5) and (6), we have
$z_1 = \dots = z_m = 0$ and so we get

$$y_1^2 + \dots + y_k^2 = -z_{m+1}^2 - \dots - z_r^2 \tag{7}$$

But this is impossible since the left member of (7) is strictly
positive whereas the right member is either negative or equal to zero.
Hence, $k$ cannot exceed $m$. In similar fashion it is established
that $m$ cannot exceed $k$. Therefore $k = m$, and the theorem is
proved.
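
The quantities named in this section can be read off from any canonical form of a real quadratic form. The sketch below is illustrative only: the function name `inertia` and the dictionary keys are assumptions, and the two coefficient lists are two different canonical forms of the same hypothetical form $2x_1x_2$ in three-dimensional space (the first obtained with $x_1 = x'_1 + x'_2$, $x_2 = x'_1 - x'_2$, the second with $x_1 = u_1 + u_2$, $x_2 = \frac{1}{2}(u_1 - u_2)$).

```python
def inertia(canonical_coefficients):
    """Rank, positive index, negative index and signature, computed from
    the coefficients of the squares in a canonical form."""
    positive = sum(1 for c in canonical_coefficients if c > 0)
    negative = sum(1 for c in canonical_coefficients if c < 0)
    return {
        "rank": positive + negative,
        "positive_index": positive,
        "negative_index": negative,
        "signature": positive - negative,
    }


# Two canonical forms of 2*x1*x2 (the third variable does not occur).
first = inertia([2, -2, 0])    # 2 x1'^2 - 2 x2'^2
second = inertia([1, -1, 0])   # u1^2 - u2^2
```

The coefficients differ between the two reductions, but the indices coincide, as the law of inertia asserts.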





§ 8. Reducing a quadratic form to canonical form
by Jacobi’s method


1. Given a quadratic form $f(x)$ which is written out in the
components of some basis $e_1, \dots, e_n$:

$$f(x) = a(x, x) = \sum a_{ik} x_i x_k$$

Recall that

$$a_{ik} = a(e_i, e_k)$$

Form the matrix of the quadratic form $f(x)$:

$$A = \begin{Vmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\
a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\
\dots & \dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn}
\end{Vmatrix}$$

Now consider the so-called principal minors of $A$:

$$\Delta_1 = a_{11}, \quad
\Delta_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad
\Delta_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \dots, \quad
\Delta_n = \det A \tag{1}$$

Also, for convenience, we introduce the quantity $\Delta_0$, assuming
$\Delta_0 = 1$.

The Jacobi method is based on the assumption that all principal
minors of the matrix $A$ are nonzero:

$$\Delta_1 \ne 0, \quad \Delta_2 \ne 0, \quad \dots, \quad \Delta_n \ne 0 \tag{2}$$

We then seek a special new basis such that

$$\begin{aligned}
e'_1 &= P_{11} e_1, \\
e'_2 &= P_{21} e_1 + P_{22} e_2, \\
 &\dots \\
e'_k &= P_{k1} e_1 + P_{k2} e_2 + \dots + P_{kk} e_k, \\
 &\dots \\
e'_n &= P_{n1} e_1 + P_{n2} e_2 + \dots + P_{nn} e_n
\end{aligned} \tag{3}$$

In order to reduce the quadratic form $f(x)$ to canonical form,
it suffices, for any $k$ ($1 \le k \le n$), to ensure the conditions

$$a'_{ik} = a(e'_i, e'_k) = 0, \qquad i = 1, 2, \dots, k - 1 \tag{4}$$

Then the $a'_{ki}$ will also be equal to zero (because the matrix of the
quadratic form is symmetric), and only the coefficients of the
squares of numerical arguments will turn out to be different from
zero.

2. Observe that to fulfil conditions (4) it suffices to require that
the following equations hold true:

$$a(e_i, e'_k) = 0, \qquad i = 1, 2, \dots, k - 1; \quad k = 1, 2, \dots, n \tag{5}$$

Indeed, from (3) and (5) we have

$$a(e'_i, e'_k) = a(P_{i1} e_1 + \dots + P_{ii} e_i,\ e'_k)
 = P_{i1}\, a(e_1, e'_k) + \dots + P_{ii}\, a(e_i, e'_k) = 0$$

To simplify subsequent derivations add to (5) the supplementary
equation

$$a(e_k, e'_k) = 1 \tag{6}$$

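3. When $k = 1$, the conditions (5) vanish and only (6) remains,
from which, taking into account the first row of (3), we get

$$1 = a(e_1, e'_1) = P_{11}\, a(e_1, e_1) = P_{11} a_{11}$$

whence

$$P_{11} = \frac{1}{a_{11}}$$

since $a_{11} = \Delta_1 \ne 0$.

Taking into account the notation of (1), we can write

$$P_{11} = \frac{\Delta_0}{\Delta_1}$$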

4. From now on we argue by induction. Assume that all
coefficients appearing in the first $k - 1$ rows of formulas (3) have
been determined. To find the coefficients appearing in the $k$th row,
we write the conditions (5) and (6) together:

$$a(e_1, e'_k) = 0, \quad \dots, \quad a(e_{k-1}, e'_k) = 0, \quad a(e_k, e'_k) = 1 \tag{7}$$

From this, using (3) we get the following system of equations for
the desired coefficients:

$$\begin{aligned}
a_{11} P_{k1} + a_{12} P_{k2} + \dots + a_{1k} P_{kk} &= 0, \\
 &\dots \\
a_{k-1,1} P_{k1} + a_{k-1,2} P_{k2} + \dots + a_{k-1,k} P_{kk} &= 0, \\
a_{k1} P_{k1} + a_{k2} P_{k2} + \dots + a_{kk} P_{kk} &= 1
\end{aligned} \tag{7a}$$

The determinant of system (7a) coincides with $\Delta_k$ and is
nonzero because of assumption (2). The desired coefficients $P_{k1}, \dots,
P_{kk}$ will therefore be found. It remains to verify that the
constructed transformation is nonsingular. With this in mind we find
the coefficient $P_{kk}$ from system (7a). Applying Cramer’s rule, we
obtain

$$P_{kk} = \frac{1}{\Delta_k}
\begin{vmatrix}
a_{11} & \dots & a_{1,k-1} & 0 \\
\dots & \dots & \dots & \dots \\
a_{k-1,1} & \dots & a_{k-1,k-1} & 0 \\
a_{k1} & \dots & a_{k,k-1} & 1
\end{vmatrix}
 = \frac{\Delta_{k-1}}{\Delta_k} \tag{8}$$

Then, using the triangular structure of the matrix of
transformation (3), we find the determinant $D$ of that matrix:

$$D = P_{11} P_{22} \cdots P_{nn}
 = \frac{\Delta_0}{\Delta_1} \cdot \frac{\Delta_1}{\Delta_2} \cdots \frac{\Delta_{n-1}}{\Delta_n}
 = \frac{1}{\Delta_n}$$

Thus, $D \ne 0$ and, hence, the transformation (3) is nonsingular.


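5. Now we can determine the coefficients of the quadratic form
in the new basis $e'_1, \dots, e'_n$. All we need to do is compute the
diagonal coefficients since all the others are zero anyway. Utilizing
(3), (7) and (8), we find

$$a'_{kk} = a(e'_k, e'_k) = a(P_{k1} e_1 + \dots + P_{kk} e_k,\ e'_k)
 = P_{kk}\, a(e_k, e'_k) = P_{kk} = \frac{\Delta_{k-1}}{\Delta_k}$$

Hence, relative to the basis constructed by the Jacobi method,

$$f(x) = \frac{\Delta_0}{\Delta_1}\, {x'_1}^2 + \frac{\Delta_1}{\Delta_2}\, {x'_2}^2 + \dots + \frac{\Delta_{n-1}}{\Delta_n}\, {x'_n}^2$$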

§ 9. Positive definite and negative definite quadratic forms 

1. In this section we consider only real spaces. 

Given in a linear space, possibly infinite-dimensional, a
quadratic form $f(x)$.

Definition 1. The quadratic form $f(x)$ is said to be positive
definite if $f(x) > 0$ for all $x \ne 0$.

Note that $f(0) = 0$ always. Indeed, since $0 = 0 \cdot z$ and
$f(x) = a(x, x)$, where $z$ is an arbitrary vector and $a(x, y)$ is a bilinear
function, it follows that

$$f(0) = a(0 \cdot z,\ 0 \cdot z) = 0 \cdot a(z, z) = 0$$

Definition 2. The quadratic form $f(x)$ is said to be negative
definite if $f(x) < 0$ for all $x \ne 0$.

It thus suffices to consider positive definite forms since negative
definite forms are obtained from the former by a change of sign.

2. Confining ourselves to quadratic forms in finite-dimensional
($n$-dimensional) spaces, we first of all note a series of simple
necessary features of positive definiteness. Suppose, relative to some
basis $e_1, \dots, e_n$, we have a quadratic form

$$f(x) = a(x, x) = \sum a_{ik} x_i x_k$$

Recall that $a_{ik} = a(e_i, e_k)$.

(1) If $f(x)$ is positive definite, then $a_{ii} > 0$ for all $i = 1, 2, \dots, n$.

Proof.

$$a_{ii} = a(e_i, e_i) = f(e_i) > 0$$

Remark. This condition is not at all sufficient for the form to be
positive definite. Here is an example. The form

$$f(x) = x_1^2 + 1000\, x_1 x_2 + x_2^2$$

has $a_{ii} = 1 > 0$, but on the vector $(-1, 1)$ it assumes a negative
value.

(2) If the form $f(x)$ is positive definite, then the determinant of
its matrix is positive:

$$\Delta = \det A > 0$$

To prove this, we reduce $f(x)$ to canonical form. Let $e'_1, \dots, e'_n$
be a canonical basis, that is, a basis in which $f(x)$ is of canonical
form:

$$f(x) = a'_{11} (x'_1)^2 + \dots + a'_{nn} (x'_n)^2$$

According to the preceding characteristic, all $a'_{ii} > 0$.

Denote by $\Delta'$ the determinant of the matrix of the form $f(x)$ in
the canonical basis. We have

$$\Delta' = \begin{vmatrix}
a'_{11} & & 0 \\
 & \ddots & \\
0 & & a'_{nn}
\end{vmatrix} = a'_{11} \cdots a'_{nn} > 0$$

On the other hand, by formula (4) of Section 3,

$$\Delta' = \Delta\, (\det P)^2$$

Hence $\Delta > 0$.

Remark. Neither is this condition sufficient for the quadratic
form to be positive definite. An example: the form

$$f(x) = -x_1^2 - x_2^2$$

has $\Delta > 0$, but $f(x) \le 0$.
(3) In $n$-dimensional space every positive definite form has
rank $n$. The proof follows from the inequality $\Delta \ne 0$.


3. Theorem (Sylvester’s criterion). For a quadratic form to be
positive definite, it is necessary and sufficient that all the principal
minors of its matrix be positive.

Necessity. Let the form $f(x)$ be positive definite. Take an
arbitrary basis $e_1, \dots, e_k, \dots, e_n$ and construct the linear hull
$L(e_1, \dots, e_k)$. Now consider the quadratic form $f(x)$ not on the
whole space but only on the subspace $L(e_1, \dots, e_k)$.

If $x \in L(e_1, \dots, e_k)$, then $x = \{x_1, \dots, x_k, 0, \dots, 0\}$ and

$$f(x) = \sum_{i,j=1}^{k} a_{ij} x_i x_j$$

All the remaining terms whose coefficients have one of the two
indices greater than $k$ vanish because of the zero values of the
coordinates.

On the subspace $L(e_1, \dots, e_k)$ the form $f(x)$ is positive definite
since it is positive definite on the whole space. Therefore the
determinant of the form $f(x)$ considered on $L(e_1, \dots, e_k)$ is positive:

$$\Delta_k = \begin{vmatrix}
a_{11} & \dots & a_{1k} \\
\dots & \dots & \dots \\
a_{k1} & \dots & a_{kk}
\end{vmatrix} > 0$$

But $\Delta_k$ is a principal minor of order $k$ of the matrix of the
quadratic form $f(x)$, and the index $k$ can assume the values
$1, 2, \dots, n$. The necessity proof is complete.

Sufficiency. Let $\Delta_k > 0$ for $k = 1, \dots, n$.

Reduce the quadratic form to canonical form by the Jacobi
method. This yields

$$f(x) = \frac{\Delta_0}{\Delta_1}\, (x'_1)^2 + \dots + \frac{\Delta_{n-1}}{\Delta_n}\, (x'_n)^2$$

All the coefficients $\Delta_{k-1}/\Delta_k$ are positive. If $x \ne 0$, then at
least one of the coordinates $x'_i \ne 0$, and, hence,
$f(x) > 0$, which completes the proof of the theorem.
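
Sylvester's criterion translates into a short test on the matrix of the form. The sketch below is an illustration under the assumption of a small symmetric matrix with exact (integer or rational) entries; `det` and `is_positive_definite` are hypothetical helper names, and the determinant again uses cofactor expansion.

```python
def det(m):
    """Determinant by cofactor expansion along the first row
    (the empty matrix has determinant 1)."""
    if not m:
        return 1
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def is_positive_definite(a):
    """Sylvester's criterion: all principal minors Delta_1..Delta_n > 0."""
    return all(det([row[:k] for row in a[:k]]) > 0
               for k in range(1, len(a) + 1))


tridiagonal = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # minors 2, 3, 4
large_cross_term = [[1, 500], [500, 1]]           # x1^2 + 1000 x1 x2 + x2^2
negative_squares = [[-1, 0], [0, -1]]             # -x1^2 - x2^2
```

The second and third matrices are the two counterexamples from the remarks of Subsection 2: each satisfies one of the necessary conditions listed there, yet fails the criterion.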


4. Take note of the two-dimensional case. Let

$$f = ax^2 + 2bxy + cy^2$$

where this time the numerical arguments of the form are denoted
by $x$, $y$.

Sylvester’s condition reduces to the inequalities

$$a > 0, \qquad \begin{vmatrix} a & b \\ b & c \end{vmatrix} = ac - b^2 > 0$$

Quite naturally, in the two-dimensional case, Sylvester’s criterion
can be established without any special theory since for positive
definiteness it is necessary that $a > 0$, and for $a > 0$ we have

$$f = \frac{1}{a}\left[(ax + by)^2 + (ac - b^2)\, y^2\right]$$


§ 10. Gram’s determinant. The Cauchy-Bunyakovsky inequality 


1. Suppose that in an arbitrary linear space $L$ (possibly
infinite-dimensional) there is given a quadratic form $f(x) = a(x, x)$ and
a finite system of vectors $p_1, \dots, p_k$.

Definition. The Gram determinant for the quadratic form $a(x, x)$
and the system of vectors $p_1, \dots, p_k$ is defined to be the quantity

$$G(p_1, \dots, p_k) = \begin{vmatrix}
a(p_1, p_1) & \dots & a(p_1, p_k) \\
\dots & \dots & \dots \\
a(p_k, p_1) & \dots & a(p_k, p_k)
\end{vmatrix}$$

Determinants of this kind are frequently encountered in
mathematical physics and the theory of integral equations.


2. Theorem. Let the space $L$ be real and the quadratic form
$a(x, x)$ positive definite. Then $G(p_1, \dots, p_k) > 0$ if the vectors
$p_1, \dots, p_k$ are linearly independent. If the vectors $p_1, \dots, p_k$ are
dependent, then $G(p_1, \dots, p_k) = 0$.

Proof. (1) Let the vectors $p_1, \dots, p_k$ be linearly independent.
Then they will constitute a basis in their linear hull $L(p_1, \dots, p_k)$.
An arbitrary vector $x \in L(p_1, \dots, p_k)$ may be written as

$$x = x_1 p_1 + \dots + x_k p_k$$

We will consider $f(x)$ on vectors of $L(p_1, \dots, p_k)$. With respect
to the basis $p_1, \dots, p_k$ we have

$$f(x) = \sum_{i,j=1}^{k} a_{ij} x_i x_j$$

(even if the original space $L$ is infinite-dimensional).

Since $f(x)$ is positive definite on the whole space $L$, it is also
positive definite on the subspace $L(p_1, \dots, p_k)$ so that

$$\Delta_k = \begin{vmatrix}
a_{11} & \dots & a_{1k} \\
\dots & \dots & \dots \\
a_{k1} & \dots & a_{kk}
\end{vmatrix} > 0 \tag{1}$$

Nolo lhal a, , aiPi.Pj), whence and also from (1) 
G{P\, •••. Pk) = ^k>0 
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(2) Now let pu .... ph be linearly dependent. Then there will 
be scalars A,i. Xt,, not all zero, for which 

KP\ + • • • + "^kPk — 0 

Note that 

a {x, 0) = 0 

and substitute 

X = P{, 0 = >.,/),+ ... -f 

into this identity. Assigning the values 1, ..., A: to i, we get a ho¬ 
mogeneous system of k linear equations in k unknowns: 

^\a{pi, Pi)+ ••• +ha(.Pi, Pk) = 0, ] 


ha(pk,Pi)+ ••• Pa) = 0 ) 

This system definitely has a nontrivial solution A,i. %h and 

therefore its determinant is zero: 

G(/)i, .... Pk) = 0 

The proof of the theorem is complete. 
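To see the theorem at work in the most familiar setting, the sketch below takes $L$ to be real coordinate space and $a(x, y)$ to be the ordinary dot product. That choice is an assumption made here for illustration (the theorem allows any positive definite form), and the helper names `dot`, `det` and `gram` are likewise illustrative.

```python
from fractions import Fraction


def dot(u, v):
    """Ordinary dot product, used here as the bilinear form a(x, y)."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def gram(vectors, form=dot):
    """Gram determinant G(p_1, ..., p_k) for the given bilinear form."""
    return det([[form(p, q) for q in vectors] for p in vectors])


print(gram([(1, 0, 1), (0, 1, 1)]))             # independent system
print(gram([(1, 2, 3), (2, 4, 6)]))             # second vector is twice the first
print(gram([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))  # third is the sum of the others
```

A different positive definite form can be tried by passing another function as `form`.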

3. An important special case. Within the framework of this
theorem, let us consider a system consisting of two vectors $p_1$, $p_2$. We
have

$$G(p_1, p_2) = \begin{vmatrix} a(p_1, p_1) & a(p_1, p_2) \\ a(p_2, p_1) & a(p_2, p_2) \end{vmatrix} \geq 0$$

Expanding this determinant and taking into account the
symmetric nature of the bilinear form, we obtain the inequality

$$[a(p_1, p_2)]^2 \leq a(p_1, p_1) \cdot a(p_2, p_2) \qquad (2)$$

which is called the Cauchy-Bunyakovsky inequality. Equality
occurs if and only if the vectors $p_1$ and $p_2$ are linearly dependent.

4. Let us consider the space of continuous functions specified on
some interval $t_1 \leq t \leq t_2$. In this space we consider the quadratic
form

$$f(x) = \int_{t_1}^{t_2} [x(t)]^2\,dt$$

(in this connection, see Section 4, Subsection 4).

The polar bilinear form of $f(x)$ is

$$a(x, y) = \int_{t_1}^{t_2} x(t)\,y(t)\,dt$$

It is readily seen that $f(x)$ is positive definite. Indeed, if the
continuous function $x(t)$ is not identically zero, then

$$\int_{t_1}^{t_2} [x(t)]^2\,dt > 0$$

Therefore, in this case inequality (2) can be used. We thus get
the Cauchy-Bunyakovsky inequality for integrals:

$$\left[\int_{t_1}^{t_2} x(t)\,y(t)\,dt\right]^2 \leq \int_{t_1}^{t_2} [x(t)]^2\,dt \times \int_{t_1}^{t_2} [y(t)]^2\,dt \qquad (3)$$

Equality in (3) occurs if and only if the system $x(t)$, $y(t)$ is
linearly dependent or, to put it more simply, if one of the functions
$x(t)$, $y(t)$ is proportional to the other (say, $y(t) = Cx(t)$, $C$
constant).
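As a quick check of inequality (3), the following worked case takes $x(t) = 1$ and $y(t) = t$ on the interval $0 \leq t \leq 1$. These particular functions and the interval are chosen here purely for illustration and do not come from the text.

```latex
\left[\int_0^1 1 \cdot t\,dt\right]^2 = \left(\tfrac{1}{2}\right)^2 = \tfrac{1}{4},
\qquad
\int_0^1 1^2\,dt \times \int_0^1 t^2\,dt = 1 \cdot \tfrac{1}{3} = \tfrac{1}{3},
\qquad
\tfrac{1}{4} < \tfrac{1}{3}.
```

The inequality is strict here because $t$ is not a constant multiple of $1$, so the two functions are linearly independent.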

This example shows that algebraic theorems operate outside the
domain of algebra proper and make it possible to obtain results
from analysis. The general basis of such applications is the
construction, in infinite-dimensional function space, of
finite-dimensional linear hulls.


§11. Zero subspaces of a bilinear and a quadratic form 

1. Let $a(x, y)$ be a bilinear form given in a space $L$.

Definition 1. We will use the term right zero subspace of the
form $a(x, y)$ for the set of all elements $y$ for each of which the
equation

$$a(x, y) = 0 \qquad (\alpha)$$

holds true for any $x \in L$. This definition clearly does not depend
on the dimension and can be used in the infinite-dimensional case.

We denote the right zero subspace by $L_0''$.

In similar fashion we define the left zero subspace $L_0'$, namely:
$y \in L_0'$ if $a(y, x) = 0$ for any $x \in L$.

2. First of all we will prove that $L_0''$ is indeed a linear
subspace. Let $y_1, y_2 \in L_0''$. Then $a(x, y_1) = 0$, $a(x, y_2) = 0$ for any $x$;
but then it follows that

$$a(x, y_1 + y_2) = a(x, y_1) + a(x, y_2) = 0$$
$$a(x, \alpha y_1) = \alpha\,a(x, y_1) = 0$$

Thus, $y_1 + y_2 \in L_0''$ and $\alpha y_1 \in L_0''$.

In quite analogous fashion we prove that $L_0'$ too is a subspace.

3. We now consider an $n$-dimensional space, in which we fix a
basis $e_1, \dots, e_n$ and write down the bilinear form in coordinates:

$$a(x, y) = \sum_{i,j} a_{ij} x_i y_j$$

where $a_{ij} = a(e_i, e_j)$.

We will show that $y \in L_0''$ if and only if

$$\sum_j a_{ij} y_j = 0 \qquad (1)$$

for all values of $i$ ($i = 1, 2, \dots, n$). To do this, we write down the
identity

$$\sum_{i,j} a_{ij} x_i y_j = \sum_i \Bigl(\sum_j a_{ij} y_j\Bigr) x_i \qquad (\beta)$$

If the conditions (1) hold, then so also does ($\alpha$). On the other
hand, from the condition ($\alpha$) it follows that all the coefficients of
$x_i$ in the right member of ($\beta$) vanish, which is what gives us the
system (1).

In exactly the same way, $y \in L_0'$ if and only if

$$\sum_i a_{ij} y_i = 0 \qquad (2)$$

for all values of $j$ ($j = 1, 2, \dots, n$).

Equations (1) and (2) are systems of equations defining $L_0''$
and $L_0'$ in terms of coordinates.

By the theorems on linear systems (see Chapter III),
(dimension of $L_0''$) = (dimension of $L_0'$) = $n - r$, where $r$ is the rank of
the bilinear form $a(x, y)$, that is, the rank of its matrix.

From this it follows that the rank of a bilinear form may be
determined geometrically, namely: the rank of the form $a(x, y)$
is equal to the difference between the dimension of the entire space
and the dimension of the zero subspace of this form (it is
immaterial which zero subspace, right or left, is taken since their
dimensions are the same).
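Systems (1) and (2) are easy to test on a small matrix. The sketch below is an illustration under assumptions made here: the 3 by 3 matrix is an arbitrary non-symmetric example, and the function names are hypothetical. It computes the rank $r$ and checks one vector in each zero subspace, showing that the two subspaces may differ even though both have dimension $n - r$.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by Gauss-Jordan elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def in_right_zero_subspace(A, y):
    """System (1): sum over j of a_ij * y_j equals 0 for every i."""
    return all(sum(a * b for a, b in zip(row, y)) == 0 for row in A)


def in_left_zero_subspace(A, y):
    """System (2): sum over i of a_ij * y_i equals 0 for every j."""
    n = len(A)
    return all(sum(A[i][j] * y[i] for i in range(n)) == 0 for j in range(len(A[0])))


# Illustrative matrix: third row = first + second, third column is zero.
A = [[1, 2, 0],
     [0, 1, 0],
     [1, 3, 0]]
n, r = len(A), rank(A)
print(n - r)                                 # common dimension of both zero subspaces
print(in_right_zero_subspace(A, (0, 0, 1)))  # spans the right zero subspace
print(in_left_zero_subspace(A, (1, 1, -1)))  # spans the left zero subspace
print(in_left_zero_subspace(A, (0, 0, 1)))   # the two subspaces differ here
```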

4. Definition 2. A bilinear form is said to be nonsingular if the
dimension of $L_0'$ (or $L_0''$) is equal to zero. In all other cases the
bilinear form is singular.

In other words, a bilinear form is singular if its zero subspaces
have a nonzero dimensionality or, what is the same thing, if its
rank is less than the dimension of the space: $r < n$, or if the
determinant of its matrix is zero: $\Delta = \det A = 0$.

5. Suppose the bilinear form $a(x, y)$ is singular, that is, its
rank $r < n$. We introduce into this space a special basis $e_1, e_2, \dots, e_n$
such that $e_{r+1}, \dots, e_n \in L_0''$. To do this, it is necessary first to
choose linearly independent vectors in $L_0''$ (their number is
precisely equal to the dimension of $L_0''$) and then complete the
basis for the entire space. For such a choice of basis, let us see
what the matrix of the bilinear form will look like. If
$j = r + 1, \dots, n$, then by the definition of $L_0''$

$$a_{ij} = a(e_i, e_j) = 0$$

Thus

$$A = \begin{pmatrix} a_{11} & \dots & a_{1r} & 0 & \dots & 0 \\ a_{21} & \dots & a_{2r} & 0 & \dots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \dots & a_{nr} & 0 & \dots & 0 \end{pmatrix}$$

The matrix is simplified, and the more so the lower the rank of the
bilinear form (the higher the dimensionality of the zero subspace).

If the basis is chosen so that the last $n - r$ basis vectors are in
the left zero subspace, then this too will lead to a simplification
of the matrix, but this time it is not the columns that vanish but
the last $n - r$ rows:

$$A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{r1} & \dots & a_{rn} \\ 0 & \dots & 0 \\ \vdots & & \vdots \\ 0 & \dots & 0 \end{pmatrix}$$

Let the bilinear form be symmetric. Then $L_0'$ coincides with $L_0''$
(prove this). Now place the basis vectors $e_{r+1}, \dots, e_n$ in $L_0''$.
These same vectors will be in $L_0'$ and the matrix will become
particularly simple in aspect:

$$A = \begin{pmatrix} a_{11} & \dots & a_{1r} & 0 & \dots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{r1} & \dots & a_{rr} & 0 & \dots & 0 \\ 0 & \dots & 0 & 0 & \dots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & \dots & 0 \end{pmatrix} \qquad (3)$$

Thus, if the rank $r$ of a symmetric bilinear form is less than $n$,
then its consideration fully reduces to a subspace of dimension $r$
(spanned by $e_1, \dots, e_r$).

6. Now let us consider the quadratic form $f(x) = a(x, x)$.

Definition 3. The zero subspace of a quadratic form is the zero
subspace $L_0$ of its polar form $a(x, y)$.

There is no need to distinguish between $L_0'$ and $L_0''$ since $L_0' = L_0''$.

If the quadratic form is nonsingular, then its zero subspace is
zero-dimensional.

If the form is singular, then its rank $r < n$ and $L_0$ has
dimension $n - r$. Relative to the basis $e_1, \dots, e_n$, the vectors
$e_{r+1}, \dots, e_n$ of which are placed in $L_0$, the matrix of the
quadratic form is of the form (3), and the form $f(x)$ may be
considered in the $r$-dimensional subspace $L(e_1, \dots, e_r)$.

§ 12. The zero cone of a quadratic form 

1. Besides the given linear space $L$ we consider the affine space
$\mathfrak{A}$ assuming that the elements of $L$ are the radius vectors of points
in $\mathfrak{A}$.

2. Definition. A set of points in affine space is called a cone
with vertex $O$ if together with every point $M$ that is noncoincident
with the vertex $O$ the set contains the entire straight line $OM$
(for $n = 3$ see Fig. 24).

In certain cases, the single point $O$ is conveniently regarded as
a cone consisting solely of the vertex. The simplest cases of cones
are: any plane passing through the point $O$, and also the entire
space $\mathfrak{A}$.

3. Let there be given in the space $L$ a quadratic form $f(x)$. It
may be regarded also in the affine space $\mathfrak{A}$ assuming that the value
of $f(x)$ at the point $M$ is defined for $x = \overrightarrow{OM}$, that is, is equal to
$f(\overrightarrow{OM})$.

Denote by $K_0$ the set of points of the affine space at which the
quadratic form $f(x)$ is zero ($M \in K_0$ if $f(\overrightarrow{OM}) = 0$).

Theorem 1. The set $K_0$ is a cone with vertex at the coordinate
origin.

Proof. It may happen that $K_0$ consists of the single point $O$
(for example, if the space is real and the form $f(x)$ is positive
definite). Then the assertion is true, for we agreed to consider a
single point a cone.

Suppose there is a point $M \neq O$ for which

$$f(x) = 0$$

when $x = \overrightarrow{OM}$.

Draw through $O$ and $M$ a straight line and on it take any point
$M^*$ (Fig. 25). Set $\overrightarrow{OM^*} = x^*$. Then $x^* = \lambda x$, where $\lambda$ is a scalar.
Consequently,

$$f(x^*) = f(\lambda x) = a(\lambda x, \lambda x) = \lambda^2 a(x, x) = \lambda^2 f(x) = 0$$

Thus, with every point $M \neq O$ the set $K_0$ also contains all points
of the straight line $OM$ (Fig. 25).


Fig. 25 


4. Definition. The set $K_0$ is called the zero cone of the quadratic
form $f(x)$.

5. Note that, generally speaking, the cone is not a linear
subspace; if $f(x) = 0$, $f(y) = 0$, then it may be that $f(x + y) \neq 0$.
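A concrete instance makes the point. The form and the two vectors below are chosen here for illustration only: both vectors lie on the zero cone, but their sum does not.

```latex
f(x) = x_1^2 - x_2^2, \qquad x = (1, 1), \quad y = (1, -1):
\qquad f(x) = 0, \quad f(y) = 0, \quad f(x + y) = f(2, 0) = 4 \neq 0 .
```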

6. Theorem 2. The zero subspace of a quadratic form is always
part of the zero cone of that form:

$$L_0 \subset K_0$$

Remark. $L_0$ is defined as a set of vectors in linear space, while
$K_0$ is defined as a set of points in affine space.

Therefore, when speaking about the inclusion $L_0 \subset K_0$, we have
to assume that $L_0$ is a point set of the endpoints of the radius
vectors of a zero subspace.

Proof. Let $y \in L_0$, $y = \overrightarrow{OM}$. Then $a(x, y) = 0$ for any $x$. Set
$x = y$ to get $a(y, y) = f(y) = 0$. Hence, $y \in K_0$ in the sense that
$M \in K_0$, where $M$ is the terminus of the vector $y$.

§ 13. Elementary examples of zero cones of quadratic forms 

1. Let us consider in more detail some particular cases
encountered in elementary analytic geometry. We will assume that the
quadratic form is not equal to zero identically and has been
reduced to normal form.

2. The real plane ($n = 2$).

(1) $f(x) = x_1^2 - x_2^2$. Here, $r = 2$ and $L_0$ is of zero dimension.
Hence, $L_0$ consists of the zero point alone. The zero cone $K_0$ is
defined by the equation $x_1^2 - x_2^2 = 0$ and decomposes into two straight
lines: $x_1 = x_2$, $x_1 = -x_2$. Due to the small dimension, the cone is
not a surface but is a line consisting of two intersecting straight
lines (Fig. 26).

(2) $f(x) = x_1^2 + x_2^2$. Again, $r = 2$ and $L_0$ is of dimension 0. The
zero cone is defined by the equation $x_1^2 + x_2^2 = 0$ and consists of a
single point. We sometimes say that such an equation defines an
imaginary cone.

(3) $f(x) = x_1^2$. Here, $r = 1$ and $L_0$ is of dimension 1. The cone
$K_0$ is defined by the equation $x_1^2 = 0$. Hence, $K_0$ consists of points
for which $x_1 = 0$. It is readily seen in this case that $L_0$ must
coincide with $K_0$. Indeed, $L_0$ is of dimension 1, and by the foregoing $L_0$
must be included in the zero cone so that $L_0$ will be the sole
straight line $x_1 = 0$ that is contained in $K_0$. The cone $K_0$ is defined
by a second-degree equation. In the case at hand we say that
every point of the axis $x_2$ of $K_0$ has to be counted twice.

In Case (3), only one square, $x_1^2$, participates. This is because
the basis vector $e_2$ is placed in the zero subspace.

3. Three-dimensional real space ($n = 3$).

(1) $f(x) = x_1^2 + x_2^2 - x_3^2$. Here, $r = 3$ and $L_0$ is of dimension
zero. The zero cone is defined by the equation $x_1^2 + x_2^2 - x_3^2 = 0$.
If the space is examined from the elementary point of view, with
angles, distances, and so forth, then such an equation defines a
circular cone with axis on the axis $x_3$ and with a right angle
between the generatrices.

In this case, Euclidean space serves as a model of a linear
space. However, one must bear in mind that in linear (and in
affine) space, angles are not defined, nor is there a rule for
measuring distances, and for this reason the concept of a “circular
cone” is meaningless. Still, this does not prevent us from using
Euclidean space as a model for a linear (or affine) space. The
supplementary properties of Euclidean space only help to make
the descriptions pictorial.

(2) $f(x) = x_1^2 + x_2^2 + x_3^2$. Here, $r = 3$, $L_0$ has dimension zero,
and $K_0$ is defined by the equation $x_1^2 + x_2^2 + x_3^2 = 0$. This is an
imaginary cone. In real space it has only the zero point.

(3) $f(x) = x_1^2 + x_2^2$. Here, $r = 2$ and $L_0$ is of dimension 1. Thus,
$L_0$ is a one-dimensional linear subspace, that is, a straight line
passing through the origin of coordinates. The cone $K_0$ is defined
by the equation $x_1^2 + x_2^2 = 0$ and consists of points of the form
$(0, 0, x_3)$; in other words, it is the set of points of the third axis.
Since $L_0 \subset K_0$, it is clear that $L_0$ is the same straight line (the
third coordinate axis). The only thing to bear in mind is that in $K_0$,
every point of this straight line is counted twice, not once.

Note that the third basis vector $e_3$ is placed in $L_0$ and so in the
representation of the form everything that is connected with the
third coordinate has vanished.

(4) $f(x) = x_1^2 - x_2^2$. Here, $r = 2$ and $L_0$ is of dimension 1. The
cone $K_0$ is defined by the equation $x_1^2 - x_2^2 = 0$. The left-hand
member of this equation can be factored into two linear factors so
that the cone $K_0$ consists of two planes. We will consider Euclidean
space as the model of the linear space. Then $K_0$ is depicted in the
form of a pair of planes that pass through the axis $x_3$, intersect at
right angles, and intersect the plane $x_3 = 0$ along the bisectors
of the quadrantal angles (Fig. 27).

In this example, the subspace $L_0$ may be found in two ways: by
computation and by geometrical reasoning. Let us consider them.

We write down the polar bilinear form and equate it to zero:

$$x_1 y_1 - x_2 y_2 = 0$$

We seek $y = (y_1, y_2, y_3)$ for which this equation holds
true for any $x = (x_1, x_2, x_3)$. It is clear that $y_1 = y_2 = 0$ and $y_3$
can assume arbitrary values. Thus, $L_0$ coincides with the third
coordinate axis.

It is not possible, directly, to obtain this result geometrically,
as was done in the preceding example. We know that $L_0$ is a
straight line that passes through the origin, but there are many
such lines in $K_0$, and it is not possible at once to isolate one of
them as the space $L_0$.

We can however approach this differently. Note that the
quadratic-form notation does not involve the third coordinate. This
means that the third basis vector is placed in $L_0$. It then follows,
by the one-dimensionality of the zero subspace, that $L_0$ coincides
with the third coordinate axis.

(5) $f(x) = x_1^2$. Here, $r = 1$, $L_0$ has dimension 2, the cone $K_0$ has
the equation $x_1^2 = 0$ and is the plane $x_1 = 0$ (taken twice).
Geometrically, $L_0$ is that same plane $x_1 = 0$.

4. Remark. We have just considered all versions that may be
encountered when studying $L_0$ and $K_0$ in two-dimensional and
three-dimensional real spaces. Indeed, an arbitrary quadratic form may
be reduced to canonical form and then, if need be, multiplied by
$-1$. In this way, the matter at hand will reduce to one of the
cases considered above.

§ 1. Reciprocal bases. Contravariant and covariant vectors

1. Let $L$ be an $n$-dimensional linear space and $L^*$ the space
conjugate to it (that is, the space of all linear forms specified on
$L$; see Section 1 of Chapter IV). In $L$ we introduce an arbitrary
basis $e_1, \dots, e_n$. The coordinates (components) of an arbitrary
vector $x$ in $L$ in the basis $e_1, \dots, e_n$ will be denoted by $\{x^1, \dots, x^n\}$.
In the conjugate space we choose a basis $e^1(x), \dots, e^n(x)$ so that
the values of the linear forms $e^i(x)$ on the vectors $e_j$ form a unit
matrix:

$$e^i(e_j) = \delta^i_j \qquad (1)$$

where $\delta^i_j$ is the Kronecker delta ($\delta^i_j = 1$ for $i = j$ and $\delta^i_j = 0$ for
$i \neq j$).

Definition. The basis $e^1(x), \dots, e^n(x)$ in $L^*$ that satisfies the
conditions (1) is said to be the reciprocal of the given basis
$e_1, \dots, e_n$ in $L$.

From the definition it follows that for any basis there exists a
unique reciprocal basis and that it is given by the formulas

$$\begin{aligned} e^1(x) &= 1 \cdot x^1 + 0 \cdot x^2 + \dots + 0 \cdot x^n, \\ &\dots \\ e^n(x) &= 0 \cdot x^1 + 0 \cdot x^2 + \dots + 1 \cdot x^n \end{aligned}$$

2. In the space $L$ let us pass over to a new basis $e_{1'}, \dots, e_{n'}$.
For the sake of convenience in notation, we will write formulas (I)
of Section 5, Chapter II, somewhat differently:

$$e_{i'} = \sum_i P^i_{i'} e_i \qquad (\text{I})$$

Here and henceforth we will prime all indices referring to a new
basis; no other special meaning is given to the symbols $1', 2', \dots, n'$,
so that $1' = 1$, $2' = 2$, ..., $n' = n$. In the matrix $P$, the upper
index varies along each row and the lower index varies down each
column:

$$P = \begin{pmatrix} P^1_{1'} & P^2_{1'} & \dots & P^n_{1'} \\ P^1_{2'} & P^2_{2'} & \dots & P^n_{2'} \\ \vdots & \vdots & & \vdots \\ P^1_{n'} & P^2_{n'} & \dots & P^n_{n'} \end{pmatrix}$$
Let an arbitrary vector $x$ in $L$ be resolved in terms of the old
basis and the new basis:

$$x = x^1 e_1 + \dots + x^n e_n = x^{1'} e_{1'} + \dots + x^{n'} e_{n'}$$

We will write formulas (III) of Section 5, Chapter II, expressing
the new components in terms of the old ones thus:

$$x^{i'} = \sum_i Q^{i'}_i x^i \qquad (\text{II})$$

We use $Q$ to denote the matrix of coefficients of the right-hand
members of (II). Note that we have to regard the upper index as
varying according to column and the lower index as varying
according to row:

$$Q = \begin{pmatrix} Q^{1'}_1 & Q^{1'}_2 & \dots & Q^{1'}_n \\ Q^{2'}_1 & Q^{2'}_2 & \dots & Q^{2'}_n \\ \vdots & \vdots & & \vdots \\ Q^{n'}_1 & Q^{n'}_2 & \dots & Q^{n'}_n \end{pmatrix}$$

For the given arrangement of indices, both in the matrix $P$ and
the matrix $Q$, the primed index denotes the number of the row, the
unprimed index the number of the column. The equations

$$Q = (P^*)^{-1}, \qquad P = (Q^*)^{-1} \qquad (2)$$

hold true, and formulas (4) of Section 5, Chapter II, become

$$\sum_{i'} P^i_{i'} Q^{i'}_j = \delta^i_j, \qquad \sum_i P^i_{i'} Q^{j'}_i = \delta^{j'}_{i'} \qquad (3)$$

We will henceforth make frequent use of relations (2) and (3)
without stipulating this in any way.

3. In the conjugate space $L^*$ we take a basis $e^{1'}(x), \dots, e^{n'}(x)$
that is reciprocal to the new basis in $L$, that is, such that it satisfies
the conditions

$$e^{i'}(e_{j'}) = \delta^{i'}_{j'} \qquad (4)$$

Let us find the formulas for passing from the basis $e^k(x)$ to the
basis $e^{i'}(x)$. Relations of the form

$$e^{i'}(x) = \sum_k A^{i'}_k e^k(x) \qquad (5)$$

with certain coefficients $A^{i'}_k$ definitely hold true. The matter at
hand therefore reduces to computing the coefficients $A^{i'}_k$ from the
given $P^i_{i'}$. From (4), (5), (I) and (1) we have

$$\delta^{i'}_{j'} = e^{i'}(e_{j'}) = \sum_k A^{i'}_k e^k(e_{j'}) = \sum_k A^{i'}_k e^k\Bigl(\sum_i P^i_{j'} e_i\Bigr) = \sum_{i,k} A^{i'}_k P^i_{j'} e^k(e_i) = \sum_{i,k} A^{i'}_k P^i_{j'} \delta^k_i$$

Hence

$$\sum_\alpha A^{i'}_\alpha P^\alpha_{j'} = \delta^{i'}_{j'} \qquad (6)$$

We denote the matrix of the desired coefficients of the right sides
of (5) by $A$ and assume that the primed index denotes the number
of the row and the unprimed index denotes the number of the
column (that is, the unprimed index varies along each row). Then all
equations (6) are equivalent to a single matrix equation:

$$AP^* = E \qquad (7)$$

where, as usual, the asterisk denotes the transpose. From (7) we
get

$$A = (P^*)^{-1} = Q$$

Hence

$$e^{i'}(x) = \sum_i Q^{i'}_i e^i(x) \qquad (\text{I}^*)$$

4. Let an arbitrary linear form $u(x)$, that is, an element of the
space $L^*$, be resolved in terms of the old and the new basis:

$$u(x) = u_1 e^1(x) + \dots + u_n e^n(x) = u_{1'} e^{1'}(x) + \dots + u_{n'} e^{n'}(x)$$

Let us now find the formulas that express the new components of
the form $u(x)$, that is, the coefficients of the resolution of $u(x)$ in
terms of the basis $e^{i'}(x)$, in terms of the old components. To do
this, recall that the coefficients of the desired formulas constitute
a matrix which is the transposed inverse of the matrix of
formulas (I*). But inversion and transposition of the matrix $Q$
yields $P$. Thus

$$u_{i'} = \sum_i P^i_{i'} u_i \qquad (\text{II}^*)$$

5. We see that the formulas (I*) and (II*) are obtained from the
familiar formulas (I) and (II) if we interchange the roles of
matrices $P$ and $Q$.


6. For greater clarity we give the following scheme.

    $e^i(e_j) = \delta^i_j$,    $e^{i'}(e_{j'}) = \delta^{i'}_{j'}$

    In the space $L$                                       In the space $L^*$

    $e_{i'} = \sum_i P^i_{i'} e_i$                         $e^{i'}(x) = \sum_i Q^{i'}_i e^i(x)$
    $x = x^1 e_1 + \dots + x^n e_n \in L$                  $u(x) = u_1 e^1(x) + \dots + u_n e^n(x) \in L^*$
    $x^{i'} = \sum_i Q^{i'}_i x^i$                         $u_{i'} = \sum_i P^i_{i'} u_i$
    $x = x^{1'} e_{1'} + \dots + x^{n'} e_{n'} \in L$      $u(x) = u_{1'} e^{1'}(x) + \dots + u_{n'} e^{n'}(x) \in L^*$
    $Q = (P^*)^{-1}$                                       $P = (Q^*)^{-1}$


7. We use the term contraction of an element $a = a^1 e_1 + \dots + a^n e_n$
in $L$ with an element $u(x) = u_1 e^1(x) + \dots + u_n e^n(x)$ in $L^*$
to signify a number denoted by $(a, u)$ or $(u, a)$ and defined by the
equation

$$(a, u) = u_1 a^1 + \dots + u_n a^n = \sum_k u_k a^k$$

A contraction is obviously an invariant since it is nothing but the
value of the invariant form $u(x) = u_1 x^1 + \dots + u_n x^n$ on the vector
$x = a = a^1 e_1 + \dots + a^n e_n$.

The invariance of a contraction can also be derived as a
consequence of formulas (II) and (II*). Indeed,

$$\sum_{k'} u_{k'} a^{k'} = \sum_{\alpha,\beta} \Bigl(\sum_{k'} P^\alpha_{k'} Q^{k'}_\beta\Bigr) u_\alpha a^\beta = \sum_{\alpha,\beta} \delta^\alpha_\beta u_\alpha a^\beta = \sum_\alpha u_\alpha a^\alpha$$
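The invariance can be checked on a small numerical case. The sketch below assumes $n = 2$ and an illustrative transition matrix $P$ (rows numbered by the primed index, as in the text); the function names and the sample components are hypothetical. Vector components are transformed with $Q = (P^*)^{-1}$, form components with $P$, and the contraction is compared before and after.

```python
from fractions import Fraction


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix over exact rationals."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(m, v):
    """Matrix times column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def contraction(u, x):
    """(x, u) = u_1 x^1 + ... + u_n x^n."""
    return sum(uk * xk for uk, xk in zip(u, x))


# Illustrative new basis: e_1' = e_1 + e_2, e_2' = e_2.
P = [[1, 1],
     [0, 1]]
Q = inverse_2x2(transpose(P))  # relation (2): Q = (P*)^(-1)

x = [3, 5]    # contravariant components x^i
u = [2, -1]   # covariant components u_i

x_new = apply(Q, x)  # formula (II): contravariant law
u_new = apply(P, u)  # formula (II*): covariant law

print(contraction(u, x), contraction(u_new, x_new))
```

The two printed values agree, which is the numerical counterpart of the identity derived above.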

8. It is clear that a contraction possesses the following two
properties:

(A) When $u$ or $x$ is multiplied by a scalar, the contraction
$(u, x)$ is multiplied by that scalar:

$$(\alpha u, x) = (u, \alpha x) = \alpha\,(u, x)$$

(B) A contraction is distributive with respect to addition:

$$(u + u', x) = (u, x) + (u', x),$$

$$(u, x + x') = (u, x) + (u, x')$$

9. Note the complete symmetry in the interrelationships of $L$
and $L^*$. We consider the contraction

$$(u, x) = u_1 x^1 + \dots + u_n x^n$$

If here the element $u$ in $L^*$ is fixed and $x = \{x^1, \dots, x^n\}$ in $L$
varies in arbitrary fashion, then the contraction $(u, x)$ is a linear
form, taken for $u$, with numerical arguments $x^1, \dots, x^n$. And $L$
may be regarded as a coordinate space. However, we can just as
easily regard $u$ as an element of a coordinate space, namely, the
very element that is defined by the coefficients of the form $(u, x)$,
and we can write $u = \{u_1, \dots, u_n\}$.

Now suppose that the element $x = \{x^1, \dots, x^n\}$ in $L$ is fixed
and that $u = \{u_1, \dots, u_n\}$ varies. In that case, the contraction
$(u, x)$ is a linear form with numerical arguments $u_1, \dots, u_n$. For
the $x$ we can take the form itself instead of the $n$-tuple $\{x^1, \dots, x^n\}$
of coefficients of the form. Thus, the elements $x \in L$ are interpreted
in exactly the same way relative to the elements $u \in L^*$ as the
elements $u \in L^*$ are relative to the elements $x \in L$. In other words,
if the space $L^*$ is conjugate to $L$, then $L$ may be regarded as
conjugate to $L^*$.

10. A symmetry in the interrelationships between $L$ and $L^*$ was
already perceived above when passing from the formulas (I) and
(II) to the formulas (I*) and (II*) (also see the table in
Subsection 6). One should bear in mind that one of the spaces $L$ and $L^*$
(namely $L$) is taken for the original space. This circumstance will
be seen to affect the terminology discussed in the next subsection.

11. A transformation by formula (I) with matrix P is called a 
transformation by the covariant law. 

A transformation by formula (II) with matrix Q is called a 
transformation by the contravariant law. 

In the given space L the components of every vector transform 
by the contravariant law. In the conjugate space the components 
of the vectors transform by the covariant law. 

Accordingly, the vectors of the given space $L$ are called
contravariant vectors and the elements of the conjugate space are called
covariant vectors.

12. In tensor calculus the practice is to use lower indices in the
case of the covariant law of transformation and upper indices for
the contravariant law of transformation. Accordingly, we used
upper indices for the components (coordinates) of vectors in $L$.

Setting the indices on elements of the matrices $P$ and $Q$ is done
so that the summation index which appears twice in the expression
of the general term of the sum is a subscript in one case and a
superscript in the other (see table in Subsection 6). If under the
summation sign there are free indices on which summation is not
performed, then the same indices (upper or lower) are set on the
quantities obtained by the summation. These rules help to
determine the transformation laws of quantities obtained by the
summation. At the same time, these rules require, for instance, that the
number-labels of the covariant basis vectors be indicated by upper
indices.

13. Throughout this chapter we will assume that the bases
chosen in $L$ and $L^*$ are reciprocal bases. The basis in $L^*$ reciprocal to
the basis $\{e_i\} \in L$ will be denoted by $\{e^i\}$, thus simplifying the
earlier designation $e^i(x)$. An arbitrary element in $L^*$ will be denoted
by $u$ (or $v$, and so forth) in place of $u(x)$ (or $v(x)$, and so forth).

14. A new definition of a conjugate space. The concept of
conjugate spaces may be explained in a somewhat different manner so
that their reciprocal equivalence will be evident from the very
definition.

Let $L$ and $L^*$ be two linear spaces. For the sake of simplicity we
will assume that they are finite-dimensional and have the same
dimension $n$. Suppose that to every pair of elements $x \in L$, $u \in L^*$
is associated a number; we denote it by $(x, u)$ and call it the
contraction of the elements $x$, $u$ if the following properties hold.

(1) The distributive property with respect to each element:

$$(x, u_1 + u_2) = (x, u_1) + (x, u_2),$$

$$(x_1 + x_2, u) = (x_1, u) + (x_2, u)$$

for arbitrary $x, x_1, x_2 \in L$, $u, u_1, u_2 \in L^*$.

(2) The associative property with respect to multiplication of
any element by a scalar:

$$(\alpha x, u) = (x, \alpha u) = \alpha\,(x, u)$$

(3) The property of nonsingularity: if $a_1, \dots, a_n$ are linearly
independent in $L$ and $(a_1, u) = 0, \dots, (a_n, u) = 0$, then $u$ is the
zero element of $L^*$. Similarly, if $b_1, \dots, b_n$ are linearly
independent in $L^*$ and $(x, b_1) = 0, \dots, (x, b_n) = 0$, then $x$ is the zero
element of $L$.

The spaces $L$ and $L^*$ may both be real or both complex.
Accordingly, the scalars are all real or complex.

Let us denote by $e_1, \dots, e_n$ an arbitrary basis in $L$ and by
$e^1, \dots, e^n$ an arbitrary basis in $L^*$. Let

$$x = x^1 e_1 + \dots + x^n e_n, \qquad u = u_1 e^1 + \dots + u_n e^n$$

By properties (1) and (2) we have

$$(x, u) = \sum_{i,k} a_i^k x^i u_k \qquad (8)$$

where $a_i^k = (e_i, e^k)$. Thus a contraction is expressed as the bilinear
form (8). It is easy to see that Property (3), that is, the condition
of nonsingularity, signifies the nonsingularity of the bilinear form
(8). It is also readily seen that the contractions $(x, u)$ may be
specified by formula (8) in different ways by arbitrarily assigning
the numbers $a_i^k$, so long as $\det a_i^k \neq 0$. The conditions (1), (2)
and (3) will hold true.

The spaces $L$ and $L^*$ are called (reciprocally) conjugate spaces
if a contraction is specified for them and if they are considered
together with the given contraction. Given this definition, we can
now construct for a given $L$ an infinitude of distinct conjugate
spaces $L^*$ (to put it more precisely, spaces differently conjugated
to $L$). In order to eliminate this indeterminacy, we will define the
concept of equivalence of linear spaces differently conjugated to
the given space $L$.

Let us denote by $L_1^*$ and $L_2^*$ two $n$-dimensional spaces
conjugate to $L$. We will say that they are equivalent in the sense of
conjugacy to $L$ if they are related by a linear isomorphism such
that

$$(x, u) = (x, u') \qquad (9)$$

where $x$ is an arbitrary element in $L$ and $u$ is an arbitrary element
in $L_1^*$, and $u'$ is the element of $L_2^*$ that corresponds to $u$ via the
isomorphism.

It will readily be seen that all linear spaces conjugate to a
given $L$ are equivalent. To prove this assertion it suffices to establish
that if an arbitrary contraction is given for $L$ and $L^*$, then for any
basis $e_1, \dots, e_n \in L$ there will be a unique reciprocal basis
$e^1, \dots, e^n$ in the space $L^*$. In other words, $e^1, \dots, e^n \in L^*$ may be
found (in unique fashion) so that $(e_i, e^k) = \delta_i^k$.

To prove this, let us consider an arbitrary basis $\tilde e^1, \dots, \tilde e^n$ of $L^*$
with the aid of which a contraction is specified by the formula (8),
so that $a_i^k = (e_i, \tilde e^k)$. We will seek the first vector $e^1$ of the
basis $e^1, \dots, e^n$ in the form

$$e^1 = \alpha_1 \tilde e^1 + \alpha_2 \tilde e^2 + \dots + \alpha_n \tilde e^n$$

We must have $(e_1, e^1) = 1$, $(e_2, e^1) = 0$, ..., $(e_n, e^1) = 0$, whence

$$\begin{aligned} a_1^1 \alpha_1 + a_1^2 \alpha_2 + \dots + a_1^n \alpha_n &= 1, \\ a_2^1 \alpha_1 + a_2^2 \alpha_2 + \dots + a_2^n \alpha_n &= 0, \\ \dots \qquad & \\ a_n^1 \alpha_1 + a_n^2 \alpha_2 + \dots + a_n^n \alpha_n &= 0 \end{aligned} \qquad (10)$$

System (10) is unambiguously solvable since $\det a_i^k \neq 0$.
Similarly, assuming $e^2 = \beta_1 \tilde e^1 + \dots + \beta_n \tilde e^n$, we find $e^2$ from the
conditions $(e_1, e^2) = 0$, $(e_2, e^2) = 1$, $(e_3, e^2) = 0$, ..., $(e_n, e^2) = 0$.
Continuing the process, we find all the vectors $e^1, \dots, e^n \in L^*$
such that $(e_i, e^k) = \delta_i^k$. They constitute an independent system.
Indeed, let

$$0^* = \sum_i \lambda_i e^i$$

Contracting the left and right sides of this equation with the
vector $e_k \in L$, we find

$$(e_k, 0^*) = \sum_i \lambda_i (e_k, e^i)$$

or $\lambda_k = 0$ ($k = 1, 2, \dots, n$) since $(e_k, e^i) = \delta_k^i$.

This proves that for any basis $e_1, \dots, e_n \in L$ there exists a
reciprocal basis $e^1, \dots, e^n$ in the space $L^*$ for any kind of specification
of conjugacy between $L$ and $L^*$.
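The construction can be traced numerically. In the sketch below, the contraction is assumed to be given by a small illustrative matrix `a` whose entry `a[i][k]` plays the role of the contraction of the $i$-th basis vector of $L$ with the $k$-th vector of the arbitrary basis of $L^*$. The matrix, the function names, and the zero-based indexing are all assumptions made for this example. Each reciprocal basis vector is found by solving one system of the form (10) with a different right-hand side.

```python
from fractions import Fraction


def solve(a, rhs):
    """Solve the square system a * alpha = rhs by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("det a = 0: the contraction is singular")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def reciprocal_coefficients(a):
    """Row j holds the coefficients of the j-th reciprocal basis vector."""
    n = len(a)
    return [solve(a, [1 if i == j else 0 for i in range(n)]) for j in range(n)]


def contract(a, i, coeffs):
    """Contraction of the i-th basis vector of L with sum_k coeffs[k] * (k-th vector)."""
    return sum(a[i][k] * coeffs[k] for k in range(len(coeffs)))


# Illustrative contraction matrix with nonzero determinant.
a = [[2, 1],
     [1, 1]]
basis = reciprocal_coefficients(a)
table = [[contract(a, i, basis[j]) for j in range(2)] for i in range(2)]
print(basis)
print(table)  # expected to be the unit matrix
```

Since the determinant of `a` is nonzero, each system has exactly one solution, which mirrors the uniqueness argument in the text.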

We now prove uniqueness. Suppose that for the given basis
$e_1, \dots, e_n \in L$ there are two reciprocal bases in the space $L^*$:
$e_1^k$ and $e_2^k$. We have $(e_i, e_1^k) = \delta_i^k$ and $(e_i, e_2^k) = \delta_i^k$, whence
$(e_i, e_1^k - e_2^k) = 0$, $i = 1, \dots, n$; $k = 1, 2, \dots, n$.

From this and by the condition of nonsingularity we find $e_1^k - e_2^k = 0^*$,
or $e_1^k = e_2^k$.

Now suppose $x = x^1 e_1 + \dots + x^n e_n \in L$, $u = u_1 e^1 + \dots + u_n e^n \in L^*$,
where $e_i$ and $e^k$ are reciprocal bases. Formula (8)
then assumes the form

$$(x, u) = x^1 u_1 + \dots + x^n u_n \qquad (11)$$

At the same time we have proved the equivalence of all spaces
conjugate to $L$. Indeed, suppose $L_1^*$ and $L_2^*$ are two spaces conjugate
to $L$, $e_1, \dots, e_n$ is any basis in $L$, $e^1, \dots, e^n$ is the reciprocal basis
in $L_1^*$, and $(e^1)', \dots, (e^n)'$ is the basis in $L_2^*$ reciprocal to
$e_1, \dots, e_n$. We establish a linear isomorphism between $L_1^*$ and $L_2^*$
assuming

$$u' = u_1 (e^1)' + \dots + u_n (e^n)', \qquad u' \in L_2^*$$

as the appropriate element for an arbitrary

$$u = u_1 e^1 + \dots + u_n e^n, \qquad u \in L_1^*$$

Then, by (11),

$$(x, u) = (x, u')$$

It is now clear that the new definition of conjugate spaces does not
differ from the earlier given definition. It suffices to notice that
associated with an arbitrary element $u \in L^*$ is the linear form

$$(x, u) = u_1 x^1 + \dots + u_n x^n$$

where $u_1, \dots, u_n$ are coefficients (the constant components of the
given vector $u \in L^*$).

§ 2. Tensor product of linear spaces 

1. Suppose we have two linear spaces $L$ and $\tilde L$, both real or both
complex (possibly infinite-dimensional). Using vectors taken from
$L$ and $\tilde L$, we will construct some new entities whose set will be
denoted by $T$.

First of all, for elements of $T$ we will take all possible pairs of
vectors $ab$, where $a \in L$, $b \in \tilde L$. Besides, there will also be
elements of $T$ consisting of all possible finite sets of such pairs. There
will be no other elements in $T$. In other words, any element $t \in T$
is of the form

$$t = \{a_1 b_1, \dots, a_k b_k\} \qquad (1)$$

where $a_1, \dots, a_k \in L$, $b_1, \dots, b_k \in \tilde L$. This equation has only one
meaning: that an element, denoted by $t$, of the set $T$ is the $k$-tuple
(complex) of pairs $a_1 b_1, \dots, a_k b_k$.

We agree always to write the element of $L$ in the first position
of a pair. If $L$ and $\tilde L$ coincide, then the pairs of vectors that make
up the elements of $T$ are considered to be ordered, that is, the
order of the vectors constituting a pair is essential. Thus, in the
case $L = \tilde L$, $a \in L$, $b \in L$, we have, generally, $ab \neq ba$.

2. For what follows it will be more convenient to call the pair
$ab$ a product of $a$ by $b$ and in place of the words “sets of pairs”
or “$k$-tuples of pairs” to use the word “sum”. Accordingly, in place
of (1) we will write

$$t = a_1 b_1 + \dots + a_k b_k \qquad (1')$$

where $a_1, \dots, a_k \in L$, $b_1, \dots, b_k \in \tilde L$. Very often, the pair $ab$ is
termed the symbolic product of $a$ by $b$, and the sum (1') is called
the symbolic sum. It will be apparent later on that this
arithmetical terminology is fairly well justified.

3. We now introduce three equivalence conditions for the set $T$,
that is, conditions under which certain elements of $T$ are said to
be equal, namely:

(1) the symbolic sum does not depend on the order of the terms,

(2) $(a + b)c = ac + bc$,

where $a$, $b$ are any vectors in $L$ and $c$ is any vector in $\tilde L$.
Similarly,

$a(b + c) = ab + ac$, where $a \in L$ and $b, c \in \tilde L$,

(3) $(\alpha a)b = a(\alpha b)$ where $a$, $b$ are arbitrary vectors taken from $L$
and $\tilde L$, respectively, and $\alpha$ is any scalar (real if $L$ and $\tilde L$ are real
spaces, and complex if these spaces are complex).

Remark. We have not yet mentioned another condition of equi¬ 
valence that has been taken for granted and should have been 
stated first; that under an admissible replacement of the vectors 

til. b\, .... bi, the element t is subjected to an admissible 

reiilacement, which means it is carried into itself. 
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4. In other words, the conditions stated in Subsection 3 mean 
that admissible replacements of the element 

/ = ai6, + ... +a/iftjer (2) 

are: 

(a) any change in the order of the pairs aib,, ..., aubh in the 
sum (2): 

(b) replacement of a pair by a sum of pairs or the replacement 
of a sum of pairs by one pair in accordance with (2) of Subsec¬ 
tion 3; for example, if a\=a'\-\- a", then the pair aifti in t may be 
replaced by the sum aift, -j- d\b{, 

(c) the transfer of a numerical factor from one vector of a given 
pair to the other vector of that pair. 

At the same time, two elements t\, are said to be equal 

if and only if, by means of a finite number of admissible replace¬ 
ments, they can be reduced to one and the same set of pairs of 
elements taken from L and L. 

5. We now introduce linear operations into the set $T$.

(1) The sum of two elements of the set $T$,

$$t = a_1 b_1 + \ldots + a_k b_k, \qquad t' = a_1' b_1' + \ldots + a_m' b_m'$$

is an element of the set $T$ which constitutes a complex of pairs of the
element $t$ combined with a complex of pairs of the element $t'$:

$$t + t' = a_1 b_1 + \ldots + a_k b_k + a_1' b_1' + \ldots + a_m' b_m'$$

(2) The product of an element $t$ by a scalar $\alpha$ is defined by the
equation

$$\alpha t = (\alpha a_1) b_1 + \ldots + (\alpha a_k) b_k$$

By Subsection 4, the sum $t + t'$ and the product $\alpha t$ are invariant
to admissible replacements of the elements $t$ and $t'$.

We now prove that the set $T$ together with such linear operations
constitutes a linear space.

First note that the axioms (1), (2) and (5)-(8) of a linear space
obviously hold true for $T$ because of the definition of linear operations
just given and due to the conditions (1) and (3) of Subsection 3.
It remains to verify axioms (3) and (4).

To verify axiom (3), we have to find a zero element in $T$. We
will show that the zero in $T$ is the pair $\theta\tilde\theta$, where $\theta$
is the zero element of $L$ and $\tilde\theta$ is the zero element of $\tilde L$.
Let us first establish that no matter what the element $b \in \tilde L$ we
have $\theta\tilde\theta = \theta b$ (similarly, $\theta\tilde\theta = a\tilde\theta$
for any $a \in L$). Indeed, by condition (3) of Subsection 3,

$$\theta b = (0 \cdot \theta) b = \theta (0 \cdot b) = \theta\tilde\theta$$

whence

$$ab + \theta\tilde\theta = ab + \theta b = (a + \theta) b = ab$$

Finally, if $t = a_1 b_1 + \ldots + a_k b_k$ is any element in $T$, then

$$t + \theta\tilde\theta = a_1 b_1 + \ldots + (a_k b_k + \theta\tilde\theta)
= a_1 b_1 + \ldots + a_k b_k = t$$

This establishes that the third axiom of a linear space holds true
for $T$.

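The fourth axiom is easy to verify. Namely, for any $t \in T$ the
additive inverse is $(-1) \cdot t$. Indeed,

$$\begin{aligned}
t + (-1) \cdot t &= a_1 b_1 + \ldots + a_k b_k + (-1)(a_1 b_1 + \ldots + a_k b_k) \\
&= (a_1 + (-1) \cdot a_1) b_1 + \ldots + (a_k + (-1) \cdot a_k) b_k \\
&= \theta \cdot b_1 + \ldots + \theta \cdot b_k
= \theta\tilde\theta + \ldots + \theta\tilde\theta = \theta\tilde\theta
\end{aligned}$$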


This completes the proof of our assertion concerning the set T. 

6. Definition. The linear space $T$, taking into account the construction
of its elements in the form of sums of products of elements
of $L$ and $\tilde L$, is termed a tensor product of the space $L$ by $\tilde L$.
In symbolic notation, we have

$$T = L \otimes \tilde L$$

The elements of the space $T$ regarded as sums of products of
elements taken from $L$ and $\tilde L$ are called tensors over the spaces $L$
and $\tilde L$.

7. Besides the spaces $L$ and $\tilde L$, let us consider their conjugate
spaces $L^*$ and $\tilde L^*$ and let us introduce one more operation, called
the contraction of elements taken from $T$ with elements from $L^*$
and from $\tilde L^*$.

For an element of $T$ let us first of all take one pair $ab$, $a \in L$,
$b \in \tilde L$. Let $u \in \tilde L^*$. We denote the contraction of the pair
$ab$ with respect to the second (right) element with the element $u$ (right
contraction) by $(ab, u)$ and we define it by the equation

$$(ab, u) = a (b, u) \tag{3}$$

Here, $(b, u)$ is the contraction of the element $b \in \tilde L$ with the
element $u \in \tilde L^*$, as understood in the sense of Subsection 7, Section 1
of this chapter. Since $(b, u)$ is a number, the contraction (3) is a
vector collinear with $a$, that is, a vector in $L$.

The contraction of the pair $ab$ with respect to its first (left)
element with the element $v \in L^*$ (left contraction) is denoted and
defined by the equation

$$(v, ab) = (v, a) b$$

This is a vector of $\tilde L$ collinear with the vector $b$.

We determine the contraction of the element $t = a_1 b_1 + \ldots + a_k b_k \in T$
(a right contraction, for example) with the element $u \in \tilde L^*$ termwise:

$$(a_1 b_1 + \ldots + a_k b_k, u) = a_1 (b_1, u) + \ldots + a_k (b_k, u)$$

8. The contraction of the element $t \in T$ with the element $u \in \tilde L^*$
or with the element $v \in L^*$ is invariant to admissible replacements
of the element $t$.

Proof. By the definition of a contraction and due to Subsection 7,
Section 1, Chapter V, the contraction is distributive with
respect to addition of elements taken from $T$, $L$ and $\tilde L$; numerical
factors can be taken outside the contraction symbol. For this
reason, an admissible replacement of the element $t$ implies also an
admissible replacement of its contraction with $u$ or $v$.

Corollary. If two elements $t_1, t_2 \in T$ are equal, then the contractions
of $t_1$ and $t_2$ with respect to the right elements of their pairs
with one and the same element $u \in \tilde L^*$ are also equal (this assertion
naturally holds true also for contractions relative to the left
elements).

§ 3. Basis in a tensor product. Components of a tensor 

1. Now let $L$ and $\tilde L$ be finite-dimensional; denote their dimensions
by $n$ and $m$ respectively. Let $e_1, \ldots, e_n$ be a basis in $L$ and
$\tilde e_1, \ldots, \tilde e_m$ a basis in $\tilde L$. Consider $T = L \otimes \tilde L$.

Lemma. If

$$e_1 \tilde a_1 + e_2 \tilde a_2 + \ldots + e_n \tilde a_n = \theta\tilde\theta \tag{1}$$

where $\tilde a_1, \ldots, \tilde a_n \in \tilde L$, then

$$\tilde a_1 = \tilde a_2 = \ldots = \tilde a_n = \tilde\theta \tag{2}$$

Proof. In $L^*$ we consider the basis $e^1, \ldots, e^n$, which is reciprocal
to the given basis in $L$. Take the left contraction of equation
(1) with the vector $e^1$. On the basis of Subsection 7, Section 2, we
get the following equation:

$$(e^1, e_1) \tilde a_1 + (e^1, e_2) \tilde a_2 + \ldots + (e^1, e_n) \tilde a_n
= (e^1, \theta) \tilde\theta = 0 \cdot \tilde\theta = \tilde\theta$$

whence

$$1 \cdot \tilde a_1 + 0 \cdot \tilde a_2 + \ldots + 0 \cdot \tilde a_n = \tilde\theta$$

Consequently, $\tilde a_1 = \tilde\theta$. Similarly we find the remaining equations
(2) by contracting (1) with $e^2, \ldots, e^n$. This completes the proof
of the lemma.

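2. Theorem 1. All pairs $e_i \tilde e_j$ are linearly independent in the
space $T$.

Proof. Suppose we have the relation

$$\sum_{i,j} \alpha_{ij} e_i \tilde e_j = \theta\tilde\theta \tag{3}$$

where $\alpha_{ij}$ are certain scalars. Equation (3) may be written as

$$\sum_i e_i \Bigl( \sum_j \alpha_{ij} \tilde e_j \Bigr) = \theta\tilde\theta$$

From this and from the preceding lemma it follows that

$$\sum_j \alpha_{ij} \tilde e_j = \tilde\theta \tag{4}$$

for every number $i$. And since $\tilde e_j$ are vectors of the basis, it follows
from (4) that $\alpha_{ij} = 0$. Theorem 1 is proved.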

Theorem 2. The pairs $e_i \tilde e_j$ form a basis in the space $T$.

Proof. Let $t \in T$ and we have $t = a_1 b_1 + \ldots + a_k b_k$. Decompose
the vectors $a_1, \ldots, a_k \in L$ in terms of the basis $e_1, \ldots, e_n$,
and the vectors $b_1, \ldots, b_k \in \tilde L$ in terms of the basis
$\tilde e_1, \ldots, \tilde e_m$. Then after grouping terms we get

$$t = \sum_{i=1}^{n} \sum_{j=1}^{m} \tau^{ij} e_i \tilde e_j \tag{5}$$

where $\tau^{ij}$ are certain numerical coefficients. By (5), any element
in $T$ is linearly expressible in terms of the pairs $e_i \tilde e_j$; from this and
from Theorem 1 follows Theorem 2.

Corollary. If $L$ and $\tilde L$ have dimensions $n$ and $m$, then the tensor
product $T = L \otimes \tilde L$ is finite-dimensional and has dimension $nm$.
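
As a rough illustration of Theorem 2 and its corollary, the sketch below stores an element of the tensor product as the $n \times m$ array of its coefficients in the expansion (5). The helper names (`pair`, `add_t`, `add_v`, `scale_v`), the dimensions $n = 2$, $m = 3$ and the sample vectors are our own assumptions, chosen only for the example. In components, the equivalence conditions (2) and (3) of Subsection 3, Section 2 turn into ordinary arithmetic identities.

```python
# Coefficient arrays of symbolic products relative to fixed bases of L (n = 2)
# and of the second space (m = 3). Vectors are plain lists of coordinates.

def pair(a, b):
    """Coefficients a^i * b^j of the symbolic product ab."""
    return [[ai * bj for bj in b] for ai in a]

def add_t(t, s):
    """Symbolic sum: coefficient arrays are added entrywise."""
    return [[x + y for x, y in zip(row_t, row_s)] for row_t, row_s in zip(t, s)]

def add_v(a, b):
    """Sum of two vectors given by coordinates."""
    return [x + y for x, y in zip(a, b)]

def scale_v(alpha, a):
    """Product of a vector by the scalar alpha."""
    return [alpha * x for x in a]

a, b = [1, 2], [3, -1]   # two vectors of L
c = [2, 0, 5]            # a vector of the second space

# condition (2): (a + b)c = ac + bc
lhs2, rhs2 = pair(add_v(a, b), c), add_t(pair(a, c), pair(b, c))
# condition (3): (alpha a)c = a(alpha c), here with alpha = 7
lhs3, rhs3 = pair(scale_v(7, a), c), pair(a, scale_v(7, c))
# number of basis pairs, i.e. the dimension nm of the tensor product
dim_T = len(a) * len(c)

print(lhs2 == rhs2, lhs3 == rhs3, dim_T)
```

The array has exactly $nm$ independent entries, which is the content of the corollary.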

3. We will now specially consider three cases of a tensor product
of two spaces when either the two spaces coincide or one of
them is conjugate to the other.

Let $L$ be a space of dimension $n$ and $L^*$ the space conjugate to
it. Let $e_1, \ldots, e_n$ be a basis in $L$ and $e^1, \ldots, e^n$ the reciprocal
basis in $L^*$.

(1) The tensor product of $L$ by $L$ will be denoted by $T_0^2$. By
Subsection 2, any element $t \in T_0^2 = L \otimes L$ has the decomposition

$$t = \sum \tau^{ij} e_i e_j \tag{6}$$

The elements of the product $T_0^2 = L \otimes L$ are called contravariant
tensors of order two over the space $L$.

(2) We denote the product of $L^*$ by $L^*$ by $T_2^0$. The elements of
this product are termed covariant tensors of order two over $L$. For
any $t \in T_2^0 = L^* \otimes L^*$ we have

$$t = \sum \tau_{ij} e^i e^j \tag{7}$$

(3) Denote the product of $L$ by $L^*$ by $T_1^1$. Its elements are mixed
tensors of order two over $L$. For every $t \in T_1^1 = L \otimes L^*$ we have

$$t = \sum \tau_j^i e_i e^j \tag{8}$$

Remark. The elements of the spaces $L$ and $L^*$ themselves, that
is, the contravariant and covariant vectors, are also called tensors
of order one (contravariant and covariant, respectively).

4. The coefficients of the expansions (6), (7), (8) are called
components of their tensors in the basis $e_1, \ldots, e_n$ of space $L$.
They are indicated by upper or lower indices depending on the
structure of the tensor. How precisely is seen in (6), (7), (8).

Remark. Tensors are defined by components, and so when we
say “given a tensor”, we write the components, for instance $\tau^{ij}$
(just as in analytic geometry we say given a point $(x, y)$).

5. The elements of an arbitrary $n \times n$ square matrix may be
taken for the components of a tensor with respect to a given basis.
Their specification in the form of an array (that is to say, in
accordance with the indices) will always define a certain tensor
in $T_0^2$ and also in $T_2^0$, and in $T_1^1$. When passing to a new basis,
the tensor components transform according to special laws that
correspond to $T_0^2$, $T_2^0$ and $T_1^1$. Let us find these laws.

6. For tensor components in $T_0^2 = L \otimes L$ we have the contravariant
law of transformation on both indices.

This means that if in space $L$ we pass to a new basis

$$e_{i'} = \sum_i P_{i'}^i e_i \tag{9}$$

then the new components of the tensor $t \in T_0^2$ will be expressed
in terms of the old components by the formulas

$$\tau^{i'j'} = \sum_{i,j} \tau^{ij} Q_i^{i'} Q_j^{j'} \tag{I}$$

Proof. The inverses of formulas (9) are

$$e_i = \sum_{i'} Q_i^{i'} e_{i'}$$

whence

$$t = \sum \tau^{ij} e_i e_j = \sum \tau^{ij} Q_i^{i'} e_{i'} \, Q_j^{j'} e_{j'}
= \sum_{i',j'} \Bigl( \sum_{i,j} \tau^{ij} Q_i^{i'} Q_j^{j'} \Bigr) e_{i'} e_{j'} \tag{10}$$

On the other hand,

$$t = \sum \tau^{i'j'} e_{i'} e_{j'} \tag{11}$$

Comparing (10) and (11), we get (I), which completes the proof.

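7. The transformation law (I) was derived as a consequence of
the invariance of tensors $t \in T_0^2$ relative to a choice of basis in the
space. We made use of this invariance when we compared (10)
and (11). Contrariwise, the invariance of the tensors $t \in T_0^2$ follows
from (I). In detail we have: if in the basis $e_1, \ldots, e_n \in L$
are arbitrarily given the numbers $\tau^{ij}$ ($i, j = 1, 2, \ldots, n$) and if
when passing to another basis $e_{1'}, \ldots, e_{n'} \in L$ by formulas (9)
they are replaced by the numbers $\tau^{i'j'}$ in accordance with (I),
then

$$\sum \tau^{i'j'} e_{i'} e_{j'} = \sum \tau^{ij} e_i e_j$$

Proof. Using (9) and (I), we get

$$\begin{aligned}
\sum \tau^{i'j'} e_{i'} e_{j'}
&= \sum \Bigl( \sum \tau^{\alpha\beta} Q_\alpha^{i'} Q_\beta^{j'} \Bigr)
   \Bigl( \sum P_{i'}^i e_i \Bigr) \Bigl( \sum P_{j'}^j e_j \Bigr) \\
&= \sum \tau^{\alpha\beta} \Bigl( \sum Q_\alpha^{i'} P_{i'}^i \Bigr)
   \Bigl( \sum Q_\beta^{j'} P_{j'}^j \Bigr) e_i e_j \\
&= \sum \tau^{\alpha\beta} \delta_\alpha^i \delta_\beta^j e_i e_j = \sum \tau^{ij} e_i e_j
\end{aligned}$$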
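
The two directions (law (I) from invariance, and invariance from law (I)) can be checked numerically. The following is a minimal sketch for $n = 2$; the particular matrices `P`, `Q` and the array `tau` are arbitrary assumptions made for the illustration, with `P[i][k]` standing for $P_{k'}^i$ and `Q[k][i]` for $Q_i^{k'}$.

```python
n = 2
P = [[2, 1], [1, 1]]      # P[i][k] = P^i_{k'}: old coordinates of e_{k'}
Q = [[1, -1], [-1, 2]]    # Q[k][i] = Q^{k'}_i: the inverse matrix of P

# sanity check that Q really inverts P
identity = [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

tau = [[1, 2], [3, 4]]    # old components tau^{ij}, chosen arbitrarily

# law (I): tau^{k'l'} = sum over i, j of tau^{ij} Q^{k'}_i Q^{l'}_j
tau_new = [[sum(tau[i][j] * Q[k][i] * Q[l][j]
                for i in range(n) for j in range(n))
            for l in range(n)] for k in range(n)]

# invariance: substituting e_{k'} = sum_i P^i_{k'} e_i into
# sum tau^{k'l'} e_{k'} e_{l'} must return the old coefficients of e_i e_j
tau_back = [[sum(tau_new[k][l] * P[i][k] * P[j][l]
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

print(tau_new, tau_back == tau)
```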

8. For the components of tensors in $T_2^0 = L^* \otimes L^*$ the covariant
law of transformation on both indices is valid:

$$\tau_{i'j'} = \sum \tau_{ij} P_{i'}^i P_{j'}^j \tag{II}$$

For components of tensors in $T_1^1 = L \otimes L^*$ we have the contravariant
transformation law on the upper index and the covariant law
on the lower index:

$$\tau_{j'}^{i'} = \sum \tau_j^i Q_i^{i'} P_{j'}^j \tag{III}$$

Both formulas (II) and (III) are derived from the invariance of
tensors in $T_2^0$ and $T_1^1$ relative to the choice of basis in $L$, exactly
like formula (I). Here, to derive formula (II) we have to use the
familiar equations

$$e^{i'} = \sum Q_i^{i'} e^i \tag{12}$$

in place of (9). To derive (III), use both (9) and (12).

Remark. In turn, the invariance of tensors in $T_2^0$ and $T_1^1$, in the
sense that it is explained in Subsection 7 for $T_0^2$, follows from (II)
and (III).

9. Linear operations on tensors in $T_0^2$, $T_2^0$ and $T_1^1$ are expressed
in terms of components by the usual rules.

(1) When tensors are added, their components are added. For
example, if

$$t = \sum \tau^{ij} e_i e_j \in T_0^2, \qquad s = \sum \sigma^{ij} e_i e_j \in T_0^2$$

then

$$t + s = \sum (\tau^{ij} + \sigma^{ij}) e_i e_j \in T_0^2$$

Quite naturally, the addition of tensors of different types (structures)
is not defined. If we added their components, the result
would not be invariant.

(2) In multiplying a tensor by a scalar, all its components are
multiplied by that scalar; for example, for $t \in T_0^2$,

$$\alpha t = \sum \alpha \tau^{ij} e_i e_j$$

10. The expression of a contraction in terms of components requires
a somewhat more detailed explanation. Suppose we have a
second-order contravariant tensor $t = \sum \tau^{ij} e_i e_j \in T_0^2$ and a
covariant vector $u = \sum u_k e^k \in L^*$. Let us for example consider the
right contraction of $t$ with the vector $u$. We have

$$(t, u) = \Bigl( \sum \tau^{ij} e_i e_j, \sum u_k e^k \Bigr)
= \sum \tau^{ij} u_k e_i (e_j, e^k)
= \sum \tau^{ij} u_k \delta_j^k e_i
= \sum_i \Bigl( \sum_j \tau^{ij} u_j \Bigr) e_i$$

Thus, as a result of this contraction we obtain a certain vector
$x = \sum x^i e_i \in L$ whose components are found by summing over
the second index of $\tau^{ij}$:

$$x^i = \sum_j \tau^{ij} u_j$$

Similarly, in the case of a left contraction

$$(u, t) = \sum_j \Bigl( \sum_i u_i \tau^{ij} \Bigr) e_j$$

we obtain a vector $y = \sum y^j e_j \in L$ whose components are found
by summing over the first index of $\tau^{ij}$:

$$y^j = \sum_i \tau^{ij} u_i$$

11. If $t = \sum \tau_{ij} e^i e^j \in T_2^0$ and $x = \sum x^k e_k \in L$, then the right
contraction

$$(t, x) = \sum_i \Bigl( \sum_j \tau_{ij} x^j \Bigr) e^i$$

This is the vector $u = \sum u_i e^i \in L^*$, that is, a vector of the space
conjugate to $L$. Its components are found by summing over the
second index of $\tau_{ij}$:

$$u_i = \sum_j \tau_{ij} x^j$$

The left contraction is also a vector in $L^*$ and reduces to summation
over the first index of $\tau_{ij}$:

$$(x, t) = \sum_j \Bigl( \sum_i x^i \tau_{ij} \Bigr) e^j$$
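
The component formulas of Subsections 10 and 11 amount to summing an array over its second or its first index. A minimal sketch follows, with sample numbers that are purely our assumption; it also shows that the right and left contractions generally differ.

```python
tau = [[1, 2], [3, 4]]   # components tau^{ij} of a tensor, chosen arbitrarily
u = [5, -1]              # components u_j of a covariant vector
n = len(u)

# right contraction: x^i = sum over j of tau^{ij} u_j (second index)
right = [sum(tau[i][j] * u[j] for j in range(n)) for i in range(n)]

# left contraction: y^j = sum over i of tau^{ij} u_i (first index)
left = [sum(u[i] * tau[i][j] for i in range(n)) for j in range(n)]

print(right, left)
```

For a symmetric array the two results coincide; in this example the array is not symmetric, and they do not.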

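12. In case $t = \sum \tau_j^i e_i e^j \in T_1^1$ a contraction is also possible
with the vector $x = \sum x^k e_k \in L$ and with the vector $u = \sum u_k e^k \in L^*$.
Namely,

$$(t, x) = \Bigl( \sum \tau_j^i e_i e^j, \sum x^k e_k \Bigr)
= \sum \tau_j^i x^k e_i (e^j, e_k)
= \sum_i \Bigl( \sum_j \tau_j^i x^j \Bigr) e_i$$

Similarly

$$(t, u) = \Bigl( \sum \tau_j^i e_i e^j, \sum u_k e^k \Bigr)
= \sum \tau_j^i u_k (e_i, e^k) e^j
= \sum_j \Bigl( \sum_i \tau_j^i u_i \Bigr) e^j$$

Thus, if $x \in L$, then $(t, x) \in L$. If $u \in L^*$, then $(t, u) \in L^*$.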


13. For $t \in T_1^1$ we have what is called an inner contraction,
which consists in the replacement of every pair $a_i b_i$ in
$t = a_1 b_1 + \ldots + a_k b_k$ ($a_i \in L$, $b_i \in L^*$) by the contraction
$(a_i, b_i)$. This definition is not connected with the choice of basis, and
for this reason the inner contraction of a mixed tensor $t$ is an invariant
number dependent solely on the choice of the element $t$ in the
space $T_1^1$.

If we denote the inner contraction of $t$ by $(t)$, then, in an arbitrary
basis, we have

$$(t) = \sum \tau_j^i (e_i, e^j) = \sum \tau_j^i \delta_i^j = \sum \tau_i^i
= \tau_1^1 + \ldots + \tau_n^n$$

whence it is also possible to derive the invariance of an inner contraction
as a consequence of formula (III). Indeed, from (III) we
have

$$\sum \tau_{i'}^{i'} = \sum \tau_j^i Q_i^{i'} P_{i'}^j = \sum \tau_j^i \delta_i^j = \sum \tau_i^i$$
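
The invariance of the inner contraction can be observed on a small example. The sketch below assumes $n = 2$ and a tensor given as a sum of two pairs; the vectors and the matrices `P`, `Q` (with `Q` inverse to `P`) are arbitrary choices made for illustration. It compares the basis-free definition, the sum of diagonal components, and the same sum after transforming the components by law (III).

```python
n = 2
# t = a1 b1 + a2 b2, each pair stored as (coordinates a^i, coordinates b_j)
terms = [([1, 2], [3, 0]), ([0, 1], [4, 5])]

# basis-free definition: replace every pair by the contraction (a_k, b_k)
inner = sum(sum(a[i] * b[i] for i in range(n)) for a, b in terms)

# mixed components tau^i_j = sum over the pairs of a^i b_j
tau = [[sum(a[i] * b[j] for a, b in terms) for j in range(n)]
       for i in range(n)]
trace_old = sum(tau[i][i] for i in range(n))

P = [[2, 1], [1, 1]]     # P[i][k] = P^i_{k'}
Q = [[1, -1], [-1, 2]]   # Q[k][i] = Q^{k'}_i, inverse of P

# law (III): tau^{k'}_{l'} = sum over i, j of tau^i_j Q^{k'}_i P^j_{l'}
tau_new = [[sum(tau[i][j] * Q[k][i] * P[j][l]
                for i in range(n) for j in range(n))
            for l in range(n)] for k in range(n)]
trace_new = sum(tau_new[k][k] for k in range(n))

print(inner, trace_old, trace_new)
```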


14. We see that in all cases a contraction reduces, in components,
to summation over one contravariant (upper) index and
over one covariant (lower) index. In this process, the total order,
that is the total number of indices indicating the tensor components,
is reduced by two. When one index remains, the result of
contraction is a tensor of order one (a vector of $L$ or of $L^*$). For
example, the contraction $x^i = \sum \tau^{ij} u_j$ leaves one free (upper)
index and yields the vector $x = \sum x^i e_i \in L$. In cases where no free
indices remain (as in Subsection 13) a numerical invariant results.
For this reason, numerical invariants are often called tensors of
order zero.

§ 4. Tensors of bilinear forms 

1. Suppose in $n$-dimensional linear space $L$ we have an invariant
bilinear form $a(x, y)$, $x, y \in L$. If a basis $e_1, \ldots, e_n$ is specified
in $L$, then the form $a(x, y)$ in this basis has the coordinate
(component) representation

$$a(x, y) = \sum a_{ij} x^i y^j \tag{1}$$

Pass to a new basis in $L$:

$$e_{i'} = \sum P_{i'}^i e_i$$

In the new basis, the form $a(x, y)$ receives a different component
representation with new coefficients $a_{i'j'}$. By Section 3, Chapter IV,

$$a_{i'j'} = \sum a_{ij} P_{i'}^i P_{j'}^j$$

Thus, the coefficients of the bilinear form in $L$ transform by the
covariant law for each index (see (II), Subsection 8, Section 3).
Therefore, we can associate to the form $a(x, y)$ a tensor from
$T_2^0$, namely

$$a = \sum a_{ij} e^i e^j \tag{2}$$

It is called the tensor of the given bilinear form. From Subsection 7,
Section 3, it follows that the tensor $a$ is associated with
the form $a(x, y)$ invariantly (that is to say, it is the same, irrespective
of the choice of basis in the space $L$).

Conversely, to any tensor (2) in $T_2^0$ there corresponds a bilinear
form in $L$. Indeed, if we perform a left contraction of (2) with the
vector $x = \sum x^i e_i \in L$ and then contract the thus found (in $L^*$)
vector with the vector $y = \sum y^j e_j \in L$, the result will be the right
member of (1). Thus, tensor $a$ is associated with the bilinear form

$$a(x, y) = ((x, a), y) = \sum a_{ij} x^i y^j \tag{3}$$

The invariance of such a construction of a bilinear form in $L$ via
a preassigned tensor in $T_2^0$ is obvious, since a contraction is an
invariant.

2. In order to establish a similar relationship with the theory
of bilinear forms for tensors in $T_0^2$ and $T_1^1$ it is necessary to consider
bilinear forms of two covariant vector arguments and bilinear
forms one of whose vector arguments is contravariant and the
other is covariant. Both forms are defined as numerical-valued
functions linear in each argument. Besides we must demand their
invariance, that is to say, that their numerical values be independent
of any choice of basis (see Subsections 3 and 4 below).
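
The correspondence of Subsection 1 between a form $a(x, y)$ and a covariant tensor can be illustrated numerically. The following sketch assumes $n = 2$; the coefficient array `a`, the vectors `x`, `y` and the matrices `P`, `Q` are sample data of our own choosing. It evaluates the form by formula (1), by the two successive contractions of formula (3), and once more in a new basis after transforming the coefficients by the covariant law and the coordinates by the contravariant law.

```python
n = 2
a = [[1, 2], [0, 3]]     # coefficients a_{ij} in the old basis
x, y = [1, 1], [2, -1]   # coordinates x^i, y^j in the old basis

# formula (1): a(x, y) = sum of a_{ij} x^i y^j
value_old = sum(a[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

# formula (3): left contraction with x gives a covariant vector b_j,
# which is then contracted with y
b = [sum(x[i] * a[i][j] for i in range(n)) for j in range(n)]
value_contraction = sum(b[j] * y[j] for j in range(n))

P = [[2, 1], [1, 1]]     # P[i][k] = P^i_{k'}
Q = [[1, -1], [-1, 2]]   # Q[k][i] = Q^{k'}_i, inverse of P

# covariant law for the coefficients
a_new = [[sum(a[i][j] * P[i][k] * P[j][l]
              for i in range(n) for j in range(n))
          for l in range(n)] for k in range(n)]
# contravariant law for the coordinates of x and y
x_new = [sum(Q[k][i] * x[i] for i in range(n)) for k in range(n)]
y_new = [sum(Q[k][i] * y[i] for i in range(n)) for k in range(n)]

value_new = sum(a_new[k][l] * x_new[k] * y_new[l]
                for k in range(n) for l in range(n))

print(value_old, value_contraction, value_new)
```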

3. The bilinear form $a(u, v)$ with two covariant arguments
$u = \sum u_i e^i \in L^*$ and $v = \sum v_j e^j \in L^*$ has the component
representation

$$a(u, v) = \sum a^{ij} u_i v_j$$

with coefficients

$$a^{ij} = a(e^i, e^j)$$

from which and from (I*), Subsection 3, Section 1,

$$a^{i'j'} = a(e^{i'}, e^{j'}) = \sum a(e^i, e^j) Q_i^{i'} Q_j^{j'} = \sum a^{ij} Q_i^{i'} Q_j^{j'}$$

Thus, the coefficients of the form $a(u, v)$ transform by the contravariant
law for every index (see (I), Subsection 6, Section 3). Therefore,
to the form $a(u, v)$ is invariantly associated a tensor of
$T_0^2$, namely,

$$a = \sum a^{ij} e_i e_j$$

Conversely, to any preassigned tensor $a \in T_0^2$ there corresponds,
in the form of a contraction, the invariant bilinear form

$$a(u, v) = ((u, a), v) = \sum a^{ij} u_i v_j$$

4. For a bilinear form $a(x, u)$ with two distinct arguments
$x = \sum x^i e_i \in L$, $u = \sum u_j e^j \in L^*$ we have

$$a(x, u) = \sum a_i^j x^i u_j$$

where

$$a_i^j = a(e_i, e^j)$$

whence

$$a_{i'}^{j'} = a(e_{i'}, e^{j'}) = \sum a(e_i, e^j) P_{i'}^i Q_j^{j'} = \sum a_i^j P_{i'}^i Q_j^{j'}$$

Thus the coefficients of the form $a(x, u)$ transform by the covariant
law for the lower index and by the contravariant law for the
upper index. Therefore, the form $a(x, u)$ is invariantly associated
with a tensor from $T_1^1$, namely

$$a = \sum a_i^j e_j e^i$$

Conversely, to every prespecified tensor $a \in T_1^1$ there corresponds
a bilinear form

$$a(x, u) = ((a, x), u) = \sum a_i^j x^i u_j$$

5. Note that the orders of the tensor of a form are opposite to
those of its arguments. For example, if a certain argument of a
form is covariant, then the corresponding index of the tensor of
that form is contravariant (upper).

6. The formulas (2) and (3) of Subsection 1 establish a one-to-one
correspondence between bilinear forms in $L$ and tensors in $T_2^0$.
This correspondence is obviously an isomorphism relative to linear
operations. Thus, in the sense of linear algebra the theory of tensors
in $T_2^0$ is equivalent to the theory of forms $a(x, y)$ in $L$. The
same may be said with respect to the theory of tensors in $T_0^2$ and
in $T_1^1$ and the theory of forms $a(u, v)$ and $a(x, u)$.

However, the construction of a separate theory of tensors (besides
the theory of forms) is necessary. Firstly, the contraction operation
does not fit into the framework of the theory of forms. Secondly,
tensors may be correlated not only with forms but with
many other entities of algebra and geometry (and also mechanics
and physics). This correlation makes it possible first of all by
general methods to construct invariants of the entities under study
(mostly in the form of contractions). Besides, there thus appears
the possibility of expressing relationships between entities in the
form of tensor equations, that is, as equalities between tensors. An
important peculiarity of tensor equations is their invariance.

7. Suppose, for example, we have the equation

$$\tau^{ij} = 0 \tag{4}$$

What it means is that, relative to a given basis, all components
of a certain tensor in $T_0^2$ are zero. But then the components of
this tensor are zero in any other basis as well. Arithmetically, this
is evident from formula (I), Subsection 6, Section 3, but in actuality
it follows directly from the very definition of tensors of $T_0^2$
as invariant objects (equation (4) expresses the invariant fact that
the tensor $t = \sum \tau^{ij} e_i e_j$ is the zero element of $T_0^2$). Quite
naturally, of course, the same may be said of an equation of the form
$\tau_{ij} = 0$ and $\tau_j^i = 0$.

Because of the invariance of tensor equations, their validity can
be proved merely by a verification relative to some one (convenient)
basis. This simple bit of reasoning will be made frequent
use of in what follows.


8. We give here a test for distinguishing tensors of order two.

Suppose an object $A$ is defined relative to some basis $e_1, \ldots, e_n$
of the space $L$ with components $a_{ik}$, but it is not known how the
components change in passing to another basis. The following
proposition holds.

If a contraction of the components $a_{ik}$ over some index with the
components of any contravariant vector always has a covariant
transformation law relative to the remaining free index, then the
components $a_{ik}$ transform by the covariant law for each index.

Proof. We carry out the proof for a contraction over the first
index. Let $x = \sum x^i e_i$ be an arbitrary contravariant vector (that
is, $x \in L$). We consider the contraction

$$b_k = \sum a_{ik} x^i$$

By hypothesis, we can regard $b_k$ as the components of some vector
$b \in L^*$ (relative to the basis $e^1, \ldots, e^n$). Let us now take in $L$
another, also arbitrary, vector $y = \sum y^k e_k$. Then

$$(b, y) = \sum b_k y^k = \sum a_{ik} x^i y^k \tag{5}$$

is an invariant. Consequently, the right member of (5) is a component
representation of an invariant bilinear form. Using this fact
and Subsection 1, we complete the proof of our proposition.

Remark. By the proposition just proved, the object $A$ is invariantly
correlated with a tensor from $T_2^0$. Accordingly, this proposition
may be taken as a test for covariant tensors of order two.

9. Similarly, if

$$y^i = \sum a^{ik} u_k$$

is a contravariant vector for any choice of a covariant vector $u_k$,
then $a^{ik}$ is a contravariant tensor of order two. If

$$y^k = \sum a_i^k x^i$$

is a contravariant vector for any choice of the contravariant vector
$x^i$, then $a_i^k$ is a mixed tensor.

Both assertions reduce (like the previous one) to Subsections 3
and 4.

§ 5. Multiple-order tensors. Tensor product 

1. Since a tensor product of two spaces has been defined, we
thus have a definition of a tensor product of any number of spaces.
It suffices to multiply them together in any order. If the linear spaces
$L_1$, $L_2$, $L_3$ are given, then the product $T = (L_1 \otimes L_2) \otimes L_3$ has
for its elements the symbolic sums of any finite number of terms
of the form $(ab)c$, where $a \in L_1$, $b \in L_2$, $c \in L_3$. The conditions of
equivalence and, respectively, the admissible replacements of the
elements of $T$ are obtained by combining the conditions of equivalence
of the elements of the product of $L_1 \otimes L_2$ by $L_3$ and the product
of $L_1$ by $L_2$. For example,

$$((a' + a'') b) c = (a'b) c + (a''b) c,$$

$$((\alpha a) b) c = (a (\alpha b)) c = (ab)(\alpha c)$$

where $\alpha$ is a scalar. Besides that, we require the equivalence condition
of an associative nature:

$$(ab) c = a (bc)$$

It signifies the identity $(L_1 \otimes L_2) \otimes L_3 = L_1 \otimes (L_2 \otimes L_3)$ and
permits writing $abc$ instead of $(ab)c$ or $a(bc)$. Linear operations in $T$
are defined together with the product of $L_1 \otimes L_2$ by $L_3$. Also defined
is a contraction: a left contraction with elements of $L_1^*$ and a
right contraction with elements of $L_3^*$. It is also necessary to determine
a contraction over the middle element of each triad with
the element $u \in L_2^*$, by setting

$$(a_1 b_1 c_1 + \ldots + a_k b_k c_k, u) = (b_1, u) a_1 c_1 + \ldots + (b_k, u) a_k c_k$$

This contraction is an element of the product $L_1 \otimes L_3$.

2. A tensor product of any number of spaces is determined by
induction.

3. Suppose we have a linear space $L$. Set

$$T_q^p = (L \otimes L \otimes \ldots \otimes L) \otimes (L^* \otimes L^* \otimes \ldots \otimes L^*)$$

where there are $p$ factors $L$ and $q$ factors $L^*$. Elements in $T_q^p$ will
be called tensors over the space $L$, contravariant of order $p$ and
covariant of order $q$. For the sake of uniformity we also denote $L$
by $T_0^1$ and $L^*$ by $T_1^0$, which is in conformity with the condition by
which elements taken from $L$ and $L^*$ are termed tensors of order
one.

4. Since every space $T_q^p$ is linear, linear operations are defined
for tensors in each of these spaces. We do not define the addition
of tensors taken from different spaces $T_{q_1}^{p_1}$ and $T_{q_2}^{p_2}$.

5. Besides linear operations on tensors in each $T_q^p$, we define
the product of tensors taken from any, even distinct, spaces $T_{q_1}^{p_1}$
and $T_{q_2}^{p_2}$.

Let $r \in T_{q_1}^{p_1}$, $s \in T_{q_2}^{p_2}$. The product of $r$ by $s$ is defined to be the
ordered pair $rs$ regarded as an element of the tensor product
$T_{q_1}^{p_1} \otimes T_{q_2}^{p_2}$. In general, $rs \ne sr$. By Subsection 1 we have, for any
$r$, $s$, $t$,

$$(rs) t = r (st)$$

Accordingly, we obtain a product of three tensors: $rst = (rs)t = r(st)$.
Thus also is defined the product of tensors of any number
of factors.


6. Suppose $r$ and $s$ can be represented as a single term, that is,
in the form of a product of elements taken from $L$ and $L^*$:

$$r = a_1 \ldots a_{p_1} b_1 \ldots b_{q_1}, \qquad
s = \tilde a_1 \ldots \tilde a_{p_2} \tilde b_1 \ldots \tilde b_{q_2}$$

where $a_1, \ldots, a_{p_1}, \tilde a_1, \ldots, \tilde a_{p_2} \in L$ and
$b_1, \ldots, b_{q_1}, \tilde b_1, \ldots, \tilde b_{q_2} \in L^*$. Then

$$rs = a_1 \ldots a_{p_1} b_1 \ldots b_{q_1} \tilde a_1 \ldots \tilde a_{p_2} \tilde b_1 \ldots \tilde b_{q_2}$$

If $rs$ contains factors solely from $L$ or from $L^*$ alone, then their
order is essential (by Subsection 1 of Section 2). In the general
case, let us agree in the product $rs$ to write out first all elements
of $L$ and then all elements of $L^*$, retaining in each case the given
sequence of the factors. Thus

$$rs = a_1 \ldots a_{p_1} \tilde a_1 \ldots \tilde a_{p_2} b_1 \ldots b_{q_1} \tilde b_1 \ldots \tilde b_{q_2}$$

This introduces a new condition for the equivalence of tensors. It
is generally accepted in most works on tensor calculus but not in
all (see, for example, Sternberg’s Lectures on Differential Geometry [23]).


7. Set $a = a_1 \ldots a_{p_1}$, assuming that $a_1, \ldots, a_{p_1}$ may be
arbitrary elements of $L$. If $p_1 = 0$, then we agree to put $a = 1$. We
will regard $\tilde a$, $b$ and $\tilde b$ in similar fashion. Then the arbitrary
tensors $r$ and $s$ ($r \in T_{q_1}^{p_1}$, $s \in T_{q_2}^{p_2}$) may be symbolized as

$$r = \sum ab, \qquad s = \sum \tilde a \tilde b$$

By Subsection 1, we can obtain the product $rs$ by a termwise multiplication
of the first of these sums by the second. Taking into
account Subsection 6, we have

$$rs = \sum a \tilde a b \tilde b$$

whence it is clear that $rs \in T_{q_1 + q_2}^{p_1 + p_2}$.

8. Let $t \in T_q^p$ with $p \ge 1$ and $q \ge 1$. As before we put
$a = a_1 \ldots a_p$, $b = b_1 \ldots b_q$, and

$$t = \sum ab$$

From among $1, 2, \ldots, p$ we choose a number $l$ and from among
$1, 2, \ldots, q$ we choose a number $m$. Denote by $a'$ the product of all
elements $a_1, \ldots, a_p$ with the exception of $a_l$, and by $b'$ the product
of all elements $b_1, \ldots, b_q$ except $b_m$. We define the inner contraction
of a tensor $t$ over the $l$th elements of $L$ and the $m$th elements
of $L^*$ to be the entity

$$(t)_m^l = \sum (a_l, b_m) a'b'$$

Here, $(a_l, b_m)$ is the contraction of the element $a_l \in L$ with the
element $b_m \in L^*$, which is to say it is a number (distinct for every
term of the sum). Thus, the contraction $(t)_m^l$ is a tensor of $T_{q-1}^{p-1}$.
If $p - 1 \ge 1$ and $q - 1 \ge 1$, then in turn we can apply the contraction
operation to the tensor $(t)_m^l$ to obtain a tensor from $T_{q-2}^{p-2}$.
Both operations can naturally be done at once. For example,

$$(t)_{12}^{12} = ((t)_1^1)_2^2 = \sum (a_1, b_1)(a_2, b_2) a_3 \ldots a_p b_3 \ldots b_q$$

If $p = q$, it is possible to exhaust all the orders of the tensor $t$
and obtain what is called a complete contraction (a scalar).
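
To make the definition of the inner contraction concrete, here is a small worked instance, assuming $p = 2$, $q = 1$ and a tensor consisting of two terms. The particular symbols $a_1, a_2, c_1, c_2, b_1, d_1$ are chosen only for this illustration.

```latex
\begin{aligned}
t &= a_1 a_2 b_1 + c_1 c_2 d_1,
   \qquad a_1, a_2, c_1, c_2 \in L, \quad b_1, d_1 \in L^*, \\
(t)_1^1 &= (a_1, b_1)\, a_2 + (c_1, d_1)\, c_2 \in T_0^1 = L, \\
(t)_1^2 &= (a_2, b_1)\, a_1 + (c_2, d_1)\, c_1 \in T_0^1 = L.
\end{aligned}
```

The two contractions are in general different vectors of $L$, since they pair the covariant factor with different contravariant factors.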

9. The contractions of two tensors that we examined above can
always be reduced to the inner contraction of one tensor. All that
is needed is to multiply the given tensors and perform an inner
contraction of their product. For instance, a contraction of the element
$x \in L$ with the element $u \in L^*$ is an inner contraction of the
product $xu$; the right contraction of the tensor $t \in T_0^2$ with the
element $u \in L^*$ is $(tu)_1^2$.

10. In concluding this section, we have a number of very important
remarks to make concerning operations on tensors. First
of all, because of what was said in Subsection 1, linear operations
in $T_q^p$ are invariant to admissible replacements of elements in
$T_q^p$. The product $rs \in T_{q_1 + q_2}^{p_1 + p_2}$ of $r \in T_{q_1}^{p_1}$ and
$s \in T_{q_2}^{p_2}$ is invariant under admissible replacements of elements $r$
and $s$ in $T_{q_1}^{p_1}$ and $T_{q_2}^{p_2}$, that is, under such transformations it
itself also receives an admissible replacement in $T_{q_1 + q_2}^{p_1 + p_2}$.
Without these properties, the definition of linear operations on tensors
and of a tensor product would be meaningless.

11. From what has been said in Subsection 1 of this section and
Subsection 8 of Section 2, it follows that under an admissible replacement
of the tensor $t$ in $T_q^p$ the contraction $(t)_m^l$ also undergoes
an admissible replacement in $T_{q-1}^{p-1}$, and so from this we have
that if $t_2 = t_1$, then $(t_2)_m^l = (t_1)_m^l$.

Proof. The equation $t_2 = t_1$ signifies that the tensors $t_2$ and $t_1$
can, by means of admissible replacements, be reduced to one and
the same collection $t$ of products of elements taken from $L$ and $L^*$.
But then $(t_1)_m^l$ and $(t_2)_m^l$ are reduced, via admissible replacements,
to $(t)_m^l$.

12. We have not made any use of bases in our definitions. For
this reason, linear operations on tensors and also the operations 
of tensor multiplication and inner contraction yield results that 
are invariant in the sense of independence of choice of basis. 

In particular, the complete contraction of a tensor is a numerical 
invariant. 

§ 6. Components of multiple-order tensors 

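1. Let $e_1, \ldots, e_n$ be a basis in $L$ and $e^1, \ldots, e^n$ the reciprocal
basis in $L^*$. From Theorem 2, Section 3, it follows (by induction)
that all possible products of the type $e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}$
constitute a basis in $T_q^p$.

Thus, for any $t \in T_q^p$ we have the decomposition

$$t = \sum \tau_{j_1 \ldots j_q}^{i_1 \ldots i_p} e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}$$

The numbers $\tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}$ define the tensor $t$ and are called the
components of the tensor relative to the basis $e_1, \ldots, e_n$ of the space $L$.
Relative to a given basis, they can be specified arbitrarily, that is,
no matter what numbers $\tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}$ are taken, a certain tensor
is always defined. One often writes: $t = \tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}$ and states that
“a tensor $\tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}$ is given”. It is well to bear in mind, however,
that the actual specification of some concrete tensor, even a tensor
of order three, requires rather involved information in the form
of tables, since the numerical values of the components must be
indicated for every combination of indices.

2. When passing to a new basis, the components $\tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}$
transform by the contravariant law for every upper index and by the
covariant law for every lower index, namely,

$$\tau_{j_1' \ldots j_q'}^{i_1' \ldots i_p'} = \sum \tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}
Q_{i_1}^{i_1'} \ldots Q_{i_p}^{i_p'} P_{j_1'}^{j_1} \ldots P_{j_q'}^{j_q} \tag{$*$}$$

where the summation is over unprimed indices.

Proof. By (I), (I*) of Section 1,

$$e_{i'} = \sum P_{i'}^i e_i, \qquad e^{i'} = \sum Q_i^{i'} e^i$$

The reciprocal formulas are

$$e_i = \sum Q_i^{i'} e_{i'}, \qquad e^i = \sum P_{i'}^i e^{i'}$$

whence

$$t = \sum \tau_{j_1 \ldots j_q}^{i_1 \ldots i_p} e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}
= \sum \Bigl( \sum \tau_{j_1 \ldots j_q}^{i_1 \ldots i_p}
Q_{i_1}^{i_1'} \ldots Q_{i_p}^{i_p'} P_{j_1'}^{j_1} \ldots P_{j_q'}^{j_q} \Bigr)
e_{i_1'} \ldots e_{i_p'} e^{j_1'} \ldots e^{j_q'} \tag{1}$$

On the other hand,

$$t = \sum \tau_{j_1' \ldots j_q'}^{i_1' \ldots i_p'} e_{i_1'} \ldots e_{i_p'} e^{j_1'} \ldots e^{j_q'} \tag{2}$$

Comparing (1) and (2), we get ($*$), which is what we wanted.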


3. We derived the transformation law ($*$) as a consequence of
the invariance of tensors $t \in T_q^p$ (we took advantage of the invariance
of $t$ when we compared (1) and (2)). Contrariwise, the
invariance of the tensors $t \in T_q^p$ follows from ($*$). Namely, due
to ($*$),

$$\sum \tau_{j_1' \ldots j_q'}^{i_1' \ldots i_p'} e_{i_1'} \ldots e_{i_p'} e^{j_1'} \ldots e^{j_q'}
= \sum \tau_{j_1 \ldots j_q}^{i_1 \ldots i_p} e_{i_1} \ldots e_{i_p} e^{j_1} \ldots e^{j_q}$$

We will not derive this equation from ($*$) but will refer the reader
to Subsection 7, Section 3, where the essence of the matter is demonstrated
in a special case.


4. Linear operations on tensors taken in some Tq are expressed 
in terms of components by the ordinary rules; in addition of tensors 
the components are added, in the multiplication of a tensor by a 
scalar the components are multiplied by that scalar. 
5. In the multiplication of a tensor $r \in T^{p_1}_{q_1}$ by a tensor $s \in T^{p_2}_{q_2}$
each component of $r$ is multiplied by each component of $s$ and all
such products constitute the components of the tensor $rs \in T^{p_1 + p_2}_{q_1 + q_2}$.
For example, if $r = \sum r_i e^i$ and $s = \sum s_j e^j$ ($r, s \in T^0_1$), then
$rs = \sum r_i s_j e^i e^j \in T^0_2$. Assuming $rs = t = \sum t_{ij} e^i e^j$, we get $t_{ij} = r_i s_j$.

Remark. In general, $rs \neq sr$. The inequality of $rs$ and $sr$ is
also readily discernible in components. Indeed, setting
$sr = \tilde t = \sum \tilde t_{ij} e^i e^j$, we get $\tilde t = \sum s_i r_j e^i e^j$, whence
$\tilde t_{ij} = s_i r_j = t_{ji}$ and, in general, $\tilde t_{ij} \neq t_{ij}$.
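
The Remark is easy to check on numbers. The sketch below uses two hypothetical covariant vectors in a two-dimensional space; the function name `product` is ours, not the book's.

```python
def product(r, s):
    """Components t_ij = r_i * s_j of the product of two covariant vectors."""
    return [[ri * sj for sj in s] for ri in r]

r = [1, 2]   # hypothetical components r_i
s = [3, 5]   # hypothetical components s_j

t = product(r, s)       # components of rs
t_rev = product(s, r)   # components of sr

# sr has the transposed table of components, so rs and sr differ
# unless that table happens to be symmetric.
is_transpose = all(t_rev[i][j] == t[j][i] for i in range(2) for j in range(2))
```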


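6. The components of the contraction $(t)^l_m$ are obtained from
the components of the tensor $t$ by summing over one upper and one
lower index, the upper index occupying the $l$th position, and the
lower index the $m$th position. This is best seen in a concrete
example. Let

$$t = \sum t^{ij}_{\alpha\beta} \, e_i e_j e^\alpha e^\beta$$

Then, for instance,

$$(t)^1_2 = \sum (e_i, e^\beta) \, t^{ij}_{\alpha\beta} \, e_j e^\alpha = \sum \delta^\beta_i \, t^{ij}_{\alpha\beta} \, e_j e^\alpha = \sum_{j, \alpha} \Big( \sum_i t^{ij}_{\alpha i} \Big) e_j e^\alpha$$

Thus, the components of $(t)^1_2$ are sums of the form $\sum_i t^{ij}_{\alpha i}$, where the
summation is taken over the first upper index and the second lower
index.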
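
As a small illustration of this rule, the following sketch contracts a tensor with two upper and two lower indices over the first upper and the second lower index. The component values are hypothetical, chosen only so that each entry reveals its own indices.

```python
n = 2

# Hypothetical components t^{ij}_{ab}, stored as t[i][j][a][b].
t = [[[[1000 * i + 100 * j + 10 * a + b for b in range(n)]
       for a in range(n)]
      for j in range(n)]
     for i in range(n)]

def contract_first_upper_second_lower(t):
    """Sum over the first upper index and the second lower index."""
    size = len(t)
    return [[sum(t[i][j][a][i] for i in range(size)) for a in range(size)]
            for j in range(size)]

c = contract_first_upper_second_lower(t)   # c[j][a], one upper and one lower index
```

The result keeps one upper index (the old second one) and one lower index (the old first one).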


§ 7. Multilinear forms and their tensors 

1. Suppose we have an invariant numerical function
$a(x_1, \dots, x_q, u^1, \dots, u^p)$ of the vector arguments $x_1, \dots, x_q \in L$,
$u^1, \dots, u^p \in L^*$. Such a function is called a multilinear form if it
is linear in each argument.

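2. Suppose in space $L$ we have chosen a basis $e_1, \dots, e_n$ and
in space $L^*$ a reciprocal basis $e^1, \dots, e^n$. Then each contravariant
argument $x_k \in L$ of the given form may be decomposed in terms
of the basis $e_1, \dots, e_n$:

$$x_k = x_k^1 e_1 + \dots + x_k^n e_n = \sum_{i_k} x_k^{i_k} e_{i_k}$$

Similarly, each covariant argument may be decomposed in terms
of the basis $e^1, \dots, e^n$:

$$u^k = u^k_1 e^1 + \dots + u^k_n e^n = \sum_{j_k} u^k_{j_k} e^{j_k}$$

whence

$$a(x_1, \dots, x_q, u^1, \dots, u^p) = a\Big( \sum x_1^{i_1} e_{i_1}, \dots, \sum x_q^{i_q} e_{i_q}, \sum u^1_{j_1} e^{j_1}, \dots, \sum u^p_{j_p} e^{j_p} \Big)$$

Thus, by the linearity of the form we obtain its component
representation

$$a(x_1, \dots, x_q, u^1, \dots, u^p) = \sum a^{j_1 \dots j_p}_{i_1 \dots i_q} \, x_1^{i_1} \dots x_q^{i_q} \, u^1_{j_1} \dots u^p_{j_p} \tag{1}$$

where

$$a^{j_1 \dots j_p}_{i_1 \dots i_q} = a(e_{i_1}, \dots, e_{i_q}, e^{j_1}, \dots, e^{j_p}) \tag{2}$$

are the coefficients of the right member of (1). According to (2)
they are the values of the form on the basis vectors.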
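
Formulas (1) and (2) can be checked on a form with one vector and one covector argument. The form below is a hypothetical example, and both arguments are given by their components in a fixed pair of reciprocal bases.

```python
def form(x, u):
    """A hypothetical form a(x, u), linear in the vector x and the covector u."""
    return 2 * x[0] * u[0] - x[0] * u[1] + 3 * x[1] * u[1]

n = 2
basis = [[1 if i == k else 0 for i in range(n)] for k in range(n)]

# Formula (2): the coefficients are the values of the form on basis vectors.
coeff = [[form(basis[i], basis[j]) for j in range(n)] for i in range(n)]

def by_components(x, u):
    """Formula (1): the sum of coeff[i][j] * x^i * u_j."""
    return sum(coeff[i][j] * x[i] * u[j] for i in range(n) for j in range(n))

x, u = [4, -1], [2, 7]   # hypothetical arguments
```

Evaluating the form directly and evaluating it through its coefficients give the same number for any choice of arguments.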


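3. When passing to a new basis we have, in $L$,

$$e_{i'_k} = \sum P^{i_k}_{i'_k} e_{i_k} \tag{3}$$

and respectively in $L^*$,

$$e^{j'_k} = \sum Q^{j'_k}_{j_k} e^{j_k} \tag{4}$$

In the new basis, the component representation of the form will
have new coefficients. Because of the invariance of the form, they
too are its values on the basis vectors (which, naturally, are new).
Thus,

$$a^{j'_1 \dots j'_p}_{i'_1 \dots i'_q} = a(e_{i'_1}, \dots, e_{i'_q}, e^{j'_1}, \dots, e^{j'_p}) \tag{5}$$

From (5), with account taken of (2), (3) and (4), we find

$$a^{j'_1 \dots j'_p}_{i'_1 \dots i'_q} = \sum P^{i_1}_{i'_1} \dots P^{i_q}_{i'_q} \, Q^{j'_1}_{j_1} \dots Q^{j'_p}_{j_p} \, a^{j_1 \dots j_p}_{i_1 \dots i_q} \tag{6}$$

where the summation on the right is over unprimed indices.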

We see that the transformation law (6) of coefficients of the
invariant multilinear form $a(x_1, \dots, x_q, u^1, \dots, u^p)$ coincides with
the transformation law (*) of Section 6 of the components of a
tensor in $T^p_q$. Hence, to every form $a(x_1, \dots, x_q, u^1, \dots, u^p)$ is
invariantly associated a tensor in $T^p_q$:

$$\sum a^{j_1 \dots j_p}_{i_1 \dots i_q} \, e_{j_1} \dots e_{j_p} e^{i_1} \dots e^{i_q} \tag{7}$$

It is called the tensor of the form.

Conversely, to every tensor (7) is associated an invariant
multilinear form (1). Note that this form is a complete contraction of
the product $a x_1 \dots x_q u^1 \dots u^p$.


§ 8. Symmetrization and antisymmetrization (alternation).
Skew-symmetric forms

1. Let us consider, relative to the basis $e_1, \dots, e_n \in L$, the
multilinear form

$$a(x, y, z, \dots, t) = \sum a_{ijk \dots s} \, x^i y^j z^k \dots t^s \tag{1}$$

and the corresponding (covariant) tensor $a_{ijk \dots s}$. The form (1)
is said to be symmetric in a given pair of arguments if interchanging
them does not change the value of the form. For example,
$a(x, y, z, \dots, t)$ is symmetric in the first and third arguments if
$a(x, y, z, \dots, t) = a(z, y, x, \dots, t)$ for any $x, y, z, \dots, t \in L$.

The symmetry of a form in a given pair of arguments implies the
symmetry of its tensor in the corresponding pair of indices. In our
case we have symmetry of the tensor in the first and third indices:
$a_{ijk \dots s} = a_{kji \dots s}$; indeed,

$$a_{ijk \dots s} = a(e_i, e_j, e_k, \dots, e_s) = a(e_k, e_j, e_i, \dots, e_s) = a_{kji \dots s}$$

Conversely, if, say, $a_{ijk \dots s} = a_{kji \dots s}$, then

$$a(x, y, z, \dots, t) = \sum a_{ijk \dots s} \, x^i y^j z^k \dots t^s = \sum a_{kji \dots s} \, z^k y^j x^i \dots t^s = a(z, y, x, \dots, t)$$

A multilinear form is said to be symmetric if it is symmetric in
every pair of arguments; associated with a symmetric form is a
symmetric tensor.

2. The form (1) is said to be skew-symmetric in a given pair of
arguments if interchanging them alters the sign of the form. Say,
$a(x, y, z, \dots, t)$ is skew-symmetric with respect to the first and
third arguments if $a(x, y, z, \dots, t) = -a(z, y, x, \dots, t)$ for
arbitrary $x, y, z, \dots, t$. The tensor of such a form is skew-symmetric
with respect to the first and third indices, that is,
$a_{ijk \dots s} = -a_{kji \dots s}$.

A multilinear form is said to be skew-symmetric (or skew) if it
is skew-symmetric with respect to every pair of arguments.
Associated with a skew-symmetric form is a skew-symmetric tensor.

A skew-symmetric form does not alter its numerical value under
any even permutation of its arguments. Under any odd permutation
of its arguments, a skew-symmetric form is multiplied by
minus unity.
3. Symmetry or skew-symmetry of a form of covariant arguments
and, correspondingly, of a contravariant tensor is defined
in complete analogy with the foregoing. In the case of a mixed
tensor, the properties of symmetry or skew-symmetry can occur
for lower indices or for upper indices. But for a pair of indices of
which one is lower and the other upper, these properties are not
invariant. For example, for a tensor $a^i_k$ in some basis we can have
the equation $a^i_k = a^k_i$ $(i, k = 1, 2, \dots, n)$, but in general it does not
hold true in a change to another basis.

4. If $a(x, y, z, \dots, t)$ is an arbitrary multilinear form, then it
can be associated, via a definite standard procedure, with a
symmetric form of the same arguments. Namely,

$$(a(x, y, z, \dots, t)) = \frac{1}{m!} \sum a(x, y, z, \dots, t)$$

where the sum on the right is taken over all permutations of the
symbols $x, y, z, \dots, t$; $m$ is the number of these symbols (the
number of arguments). This operation is called symmetrization
and is denoted by parentheses (round brackets). In the special
cases of $m = 2$ and $m = 3$ we have

$$(a(x, y)) = \frac{1}{2!} \{ a(x, y) + a(y, x) \},$$

$$(a(x, y, z)) = \frac{1}{3!} \{ a(x, y, z) + a(y, z, x) + a(z, x, y) + a(y, x, z) + a(x, z, y) + a(z, y, x) \}$$

Associated with symmetrization of a form is symmetrization of a
tensor. For instance,

$$a_{(ij)} = \frac{1}{2!} \{ a_{ij} + a_{ji} \},$$

$$a_{(ijk)} = \frac{1}{3!} \{ a_{ijk} + a_{jki} + a_{kij} + a_{jik} + a_{ikj} + a_{kji} \}$$

If the form $a$ is symmetric, then $(a) = a$.

5. We have to do, for example, with symmetrization of the tensor
of a multilinear form when the arguments of the form are
identical. For instance, if

$$a(x, y) = \sum a_{ik} x^i y^k$$

is a bilinear form (in general not symmetric) and we construct
the quadratic form $a(x, x)$, then we can collect terms thus:

$$a_{ik} x^i x^k + a_{ki} x^k x^i = (a_{ik} + a_{ki}) x^i x^k = 2 a_{(ik)} x^i x^k$$

The coefficients of the resulting quadratic form are the numbers
$a_{(ik)}$; they constitute its (symmetric) matrix. The polar form of
$a(x, x)$ is the form $(a(x, y))$. Similarly, $a_{(ijk)}$ are called the
coefficients of the cubic form $a(x, x, x)$ obtained via identification of
the arguments of the trilinear form $a(x, y, z)$.

6. The operation of antisymmetrization (alternation) consists in
the following: an arbitrary multilinear form $a(x, y, z, \dots, t)$ is
associated, via a definite standard procedure, with a skew-symmetric
form of those same arguments. It is denoted by square
brackets and is defined by the equation

$$[a(x, y, z, \dots, t)] = \frac{1}{m!} \Big\{ \sum a(x, y, z, \dots, t) - \sum a(x, y, z, \dots, t) \Big\}$$

where the first sum is taken over all even permutations of the
symbols $x, y, z, \dots, t$, and the second over all odd permutations.
For instance,

$$[a(x, y)] = \frac{1}{2!} \{ a(x, y) - a(y, x) \},$$

$$[a(x, y, z)] = \frac{1}{3!} \{ a(x, y, z) + a(y, z, x) + a(z, x, y) - a(y, x, z) - a(x, z, y) - a(z, y, x) \}$$

Accordingly, we have the operation of antisymmetrization
(alternation) of a tensor:

$$a_{[ijk]} = \frac{1}{3!} \{ a_{ijk} + a_{jki} + a_{kij} - a_{jik} - a_{ikj} - a_{kji} \}$$

If the form $a$ is skew (skew-symmetric with respect to all
arguments), then $[a] = a$.
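
Both operations can be carried out mechanically on a table of components. The sketch below stores a covariant tensor as a dictionary keyed by index tuples; this representation and the helper names are our assumptions, and the sample components are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def parity(perm):
    """Return +1 for an even permutation of 0..m-1 and -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def permuted_average(a, m, n, signed):
    """Symmetrize (signed=False) or alternate (signed=True) a covariant tensor.

    a maps index tuples of length m, with entries 0..n-1, to numbers.
    """
    result = {}
    for idx in product(range(n), repeat=m):
        total = 0
        for perm in permutations(range(m)):
            term = a[tuple(idx[p] for p in perm)]
            total += (parity(perm) * term) if signed else term
        result[idx] = Fraction(total, factorial(m))
    return result

# Hypothetical components a_ij of a second-order tensor, n = 2.
a = {(0, 0): 1, (0, 1): 2, (1, 0): 5, (1, 1): 3}

sym = permuted_average(a, 2, 2, signed=False)   # a_(ij)
alt = permuted_average(a, 2, 2, signed=True)    # a_[ij]
```

For two indices the symmetric and skew parts add up to the original tensor, and alternating a tensor that is already skew leaves it unchanged.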

7. We note two simple properties of the set of skew-symmetric 
forms: 

(1) The skew-symmetric form vanishes even if only two arguments
are identical.

Indeed, suppose, say, the first two arguments of the form
coincide. Then $a(y, y, z, \dots, t) = -a(y, y, z, \dots, t)$. Hence,
$2a(y, y, z, \dots, t) = 0$.

(2) If the number of arguments of a skew-symmetric form
exceeds the dimensionality of the space, then the form is identically
zero.

Indeed, in this case the arguments are linearly related and hence
one of them is linearly expressible in terms of the others. Suppose,
for example, $x = \alpha y + \beta z + \dots + \lambda t$. Then

$$a(x, y, z, \dots, t) = \alpha \, a(y, y, z, \dots, t) + \beta \, a(z, y, z, \dots, t) + \dots + \lambda \, a(t, y, z, \dots, t)$$

and by the first property $a(y, y, z, \dots, t) = 0$, ...,
$a(t, y, z, \dots, t) = 0$, whence $a(x, y, z, \dots, t) = 0$.


8. We now consider skew forms in which the number of arguments
is equal to the dimension of the space.

Let $a(x, y, \dots, t)$ be an arbitrary skew form of $n$ arguments
$x, y, \dots, t$ belonging to an $n$-dimensional space $L$. Fix an arbitrary
basis $e_1, \dots, e_n$ in $L$ and decompose the arguments of the form
relative to this basis. By Subsection 2, Section 7, we have

$$a(x, y, \dots, t) = \sum a_{i_1 i_2 \dots i_n} \, x^{i_1} y^{i_2} \dots t^{i_n} \tag{2}$$

where

$$a_{i_1 i_2 \dots i_n} = a(e_{i_1}, e_{i_2}, \dots, e_{i_n})$$

From the definition of a skew form and from property (1) of the
preceding subsection,

$$a(e_{i_1}, e_{i_2}, \dots, e_{i_n}) = \varepsilon_{i_1 i_2 \dots i_n} \, a(e_1, e_2, \dots, e_n)$$

that is,

$$a_{i_1 i_2 \dots i_n} = \varepsilon_{i_1 i_2 \dots i_n} \, a_{12 \dots n} \tag{3}$$

Here $\varepsilon_{i_1 i_2 \dots i_n}$ is zero if two of the indices coincide and is $+1$ or $-1$
according as $i_1, i_2, \dots, i_n$ is an even or an odd permutation of
$1, 2, \dots, n$.

Substituting (3) into (2), we see that the form at hand can be
represented relative to the given basis as

$$a(x, y, \dots, t) = a_{12 \dots n} \, \Delta \tag{4}$$

where $\Delta$ is a determinant made up of the components of the
arguments:

$$\Delta = \begin{vmatrix} x^1 & x^2 & \dots & x^n \\ y^1 & y^2 & \dots & y^n \\ \vdots & \vdots & & \vdots \\ t^1 & t^2 & \dots & t^n \end{vmatrix} \tag{5}$$

We will call the coefficient $a_{12 \dots n}$ the principal coefficient of
the skew form (2). All the other coefficients of the form are either
zero or equal to $\pm a_{12 \dots n}$ (in accordance with formula (3)).

If $a_{12 \dots n} = 0$, then the skew form $a(x, y, \dots, t)$ is identically
zero.

If $a_{12 \dots n} \neq 0$, then $a(x, y, \dots, t)$ is different from zero when
the arguments are linearly independent.

From formulas (4) and (5) it follows that, up to a numerical
factor, there is only one skew form of $n$ arguments in the space $L$.
Indeed, if the given form $a(x, y, \dots, t)$ is not identically zero,
and $b(x, y, \dots, t)$ is any other skew form of $n$ arguments, then

$$b(x, y, \dots, t) = b_{12 \dots n} \, \Delta = \lambda \, a(x, y, \dots, t)$$

where

$$\lambda = \frac{b_{12 \dots n}}{a_{12 \dots n}}$$

Note that the reasoning in this section is carried through relative
to a given basis and only the skew symmetry of the forms is
utilized and not their invariance. We will take advantage of this fact
in the next chapter where we consider skew-symmetric multilinear
functions whose numerical value is not invariant with respect to
a change of basis.
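
The representation of a skew form of n arguments as its principal coefficient times a determinant can be checked on numbers. In the sketch below the principal coefficient and the three vectors are hypothetical, indices run from 0 rather than 1, and the helper names are ours.

```python
from itertools import permutations, product

def parity(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def det(rows):
    """Determinant by the permutation expansion."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        for r in range(n):
            term *= rows[r][perm[r]]
        total += term
    return total

def skew_coefficients(principal, n):
    """All coefficients of a skew form, built from its principal coefficient."""
    coeff = {idx: 0 for idx in product(range(n), repeat=n)}
    for perm in permutations(range(n)):
        coeff[perm] = parity(perm) * principal
    return coeff

def evaluate(coeff, args):
    """Component representation of the form on the vectors in args."""
    total = 0
    for idx, c in coeff.items():
        term = c
        for vec, i in zip(args, idx):
            term *= vec[i]
        total += term
    return total

a = skew_coefficients(principal=5, n=3)
x, y, z = [1, 2, 0], [0, 1, 4], [3, -1, 2]   # hypothetical arguments
```

The value of the form on x, y, z equals 5 times the determinant of their components, and it vanishes as soon as two arguments coincide.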

9. A more detailed study of skew-symmetric forms and skew- 
symmetric tensors is made in Chapter X. 

§ 9. An alternative description of the tensor product 
of two linear spaces 

1. At the beginning of this chapter, in Section 2, we defined a
tensor product $T = L \otimes \tilde L$ as a set whose elements are any finite
collections of pairs consisting of elements of $L$ and $\tilde L$. Thus, if
certain specific entities are chosen as elements of $L$ and $\tilde L$, then
the elements of $L \otimes \tilde L$ are quite concrete (collections of pairs of
these entities). In order to make $L \otimes \tilde L$ a linear space we had to
include a description of admissible replacements and linear
operations for the elements of the set $L \otimes \tilde L$. Since we had to construct
the linear space $L \otimes \tilde L$, the description of a tensor product
appeared to be rather unwieldy.

We now give a definition of the tensor product $L \otimes \tilde L$ that is
quite independent of the one previously described. It will be more
economical in the sense that for $L \otimes \tilde L$ we will from the start take
a certain linear space $T$ (the meaning of the equation $T = L \otimes \tilde L$
will consist in establishing only certain interrelationships between
the elements of $L$, $\tilde L$ and $T$). Unfortunately, the new definition will
have some defects of its own. The point is that there will be a good
deal of arbitrariness in the new construction of the tensor product
$L \otimes \tilde L$ (unlike the old construction, where the elements of $L \otimes \tilde L$
are completely determined by the elements of $L$ and $\tilde L$). For this
reason, first of all, given $L$ and $\tilde L$, we define not one tensor product
$L \otimes \tilde L$ but a set of distinct products. But then we define a certain
natural notion of the isomorphism of these tensor products. They
will turn out to be isomorphic (equivalent) to one another and
also to the tensor product of $L$ by $\tilde L$ in the meaning of our old
construction in Section 2. Of course, this will require a certain amount
of time and energy with the result that there will probably be no
saving after all.

2. We give this alternative description of a tensor product with
the aim, as far as this is possible, of helping the reader to clarify
the following problem.

Let it be said that a linear space $T$ is the tensor product of a
linear space $L$ and a linear space $\tilde L$. We confine ourselves here to
the finite-dimensional case. Then the dimension of $T$ is equal to the
product of the dimensions of $L$ and $\tilde L$. Naturally, by itself this
relation of dimensions does not suffice to characterize $T$ as the
tensor product $L \otimes \tilde L$. The point is that in specific instances we
may not be able to perceive the construction of a tensor product
as described in Section 2. What is more, all three spaces, $L$, $\tilde L$
and $T$, may be given to within linear isomorphisms, and then the
symbolic sums described in Section 2 are replaced by elements of
quite a different nature. Therefore, an answer to the question of
what it means that $T$ is the product $L \otimes \tilde L$ actually cannot be
given in such a general situation on the basis of Section 2. This
requires giving the very definition of a tensor product in a more
general form, which is precisely what we now intend to do.

3. Suppose we have linear spaces $L$ and $\tilde L$ of dimension $n$ and $m$
respectively. Let there also be a linear space $T$ whose dimension
is equal to the product $nm$. The spaces $L$, $\tilde L$, $T$ are assumed to be
all real or all complex.

Further, let there be given a certain map $f$ of the two spaces $L$,
$\tilde L$ into the space $T$. This means that there is an element $t \in T$
associated with an arbitrary pair of elements $a$, $\tilde a$, where $a \in L$,
$\tilde a \in \tilde L$. For the sake of simplicity we write

$$t = a \tilde a \tag{1}$$

thus identifying the pair $a \tilde a$ with its image $t = f(a, \tilde a)$ in the
space $T$. Let us agree from now on to write the element of $L$ first.
If $\tilde L$ coincides with $L$, then the pair in the right member of (1)
will be taken to be ordered. Thus, in general, $a \tilde a \neq \tilde a a$, that is, the
elements which in $T$ correspond to the pairs $a \tilde a$ and $\tilde a a$ by virtue of
the map $f$ do not necessarily coincide.

We assume that the following properties of the map $f$ hold true.

(1) Distributive property (with respect to each element of a
pair):

$$(a + b) \tilde a = a \tilde a + b \tilde a \tag{2}$$

$$a (\tilde a + \tilde b) = a \tilde a + a \tilde b \tag{3}$$

for any $a, b \in L$, $\tilde a, \tilde b \in \tilde L$.

(2) Associative property:

$$(\alpha a) \tilde a = a (\alpha \tilde a) = \alpha (a \tilde a) \tag{4}$$

for any $a \in L$, $\tilde a \in \tilde L$ and for any scalar $\alpha$ (real or complex,
depending on whether the spaces $L$, $\tilde L$, $T$ are real or complex).

By virtue of the properties (2), (3), (4), there are sufficient
grounds to use the word "product" in place of "pair". Accordingly,
equation (1) is to be read thus: element $t$ of space $T$ is the
product of element $a$ of $L$ by element $\tilde a$ of $\tilde L$.

Remark. Now, equations (2), (3) and (4), unlike the similar
equations of Section 2, express the properties of the map $f$ and
not the conditions of admissible replacements in $T$. The point is
that the admissible replacements have already been given in $T$
beforehand together with the definition of $T$ as a linear space.

(3) The property of nonsingularity of the map $f$: if the elements
$a_1, \dots, a_n$ are linearly independent in $L$ and the elements
$\tilde a_1, \dots, \tilde a_m$ are linearly independent in $\tilde L$, then the system $a_i \tilde a_k$
$(i = 1, 2, \dots, n;\ k = 1, 2, \dots, m)$ is linearly independent in $T$.

By the foregoing, some of the elements of $T$ are products of
elements of the spaces $L$ and $\tilde L$. For example, due to (4) the zero
element in $T$ is the product of the zero element of $L$ into any element
of $\tilde L$, or of any element of $L$ into the zero element of $\tilde L$. However,
not every element of $T$ is a product of some element of $L$ by some
element of $\tilde L$. Also, it is easy to prove the following assertion:
every element $t \in T$ is a linear combination of products of elements
taken from $L$ and $\tilde L$.

Proof. Let $e_1, \dots, e_n$ be a basis in $L$ and $\tilde e_1, \dots, \tilde e_m$ a basis in $\tilde L$.
Then, due to condition (3), the system of all product pairs $e_i \tilde e_k$ is
a basis in $T$ (since the dimension of $T$ is equal to $nm$). Hence, any
element $t \in T$ can be represented as

$$t = \sum t^{ik} e_i \tilde e_k \tag{5}$$

where $i = 1, 2, \dots, n$; $k = 1, 2, \dots, m$. The proof is complete.

Remark. If we put $\sum_i t^{ik} e_i = a_k$ and $\tilde e_k = \tilde a_k$, then (5) takes
the form

$$t = \sum a_k \tilde a_k \tag{6}$$

Thus every element $t \in T$ can be represented as a sum of products
of elements taken from $L$ and $\tilde L$.
4. Definition. A linear space $T$ of dimension $nm$ regarded
together with a given map $f$ into $T$ of a pair of linear spaces $L$, $\tilde L$ of
dimensions $n$ and $m$, respectively, is called a tensor product of $L$
by $\tilde L$ if $f$ satisfies the conditions (1), (2), (3) of Subsection 3.
Symbolically, $T = L \otimes \tilde L$.

The elements of the space $T$ regarded as linear combinations of
the products of elements taken from $L$ and $\tilde L$, that is, as (5) or
(6), are termed tensors over $L$ and $\tilde L$. The numbers $t^{ik}$ in (5) are
called the components of the tensor $t$ relative to the basis $e_i \tilde e_k$ (or
relative to the bases $e_i$ and $\tilde e_k$ of the spaces $L$ and $\tilde L$).

5. We now show how it is possible to construct a map $f$ with
the properties (1), (2), (3) of Subsection 3. At the same time we
will determine the degree of arbitrariness in this construction.

Let us first assume that the map $f$ is already given. Let $e_i$ and
$\tilde e_k$ be arbitrary bases in $L$ and $\tilde L$. Then by property (3) the product
pairs $e_i \tilde e_k$ defined by the map $f$ constitute a basis in $T$. Suppose
we know $e_i \in L$, $\tilde e_k \in \tilde L$ and $e_i \tilde e_k \in T$. Then we know the
map $f$ completely, that is, for any $x \in L$ and $\tilde x \in \tilde L$ we know the
product $x \tilde x$ in the space $T$. Indeed,

$$x = \sum x^i e_i, \qquad \tilde x = \sum \tilde x^k \tilde e_k \tag{7}$$

whence and due to properties (1) and (2)

$$x \tilde x = \sum x^i e_i \sum \tilde x^k \tilde e_k = \sum x^i \tilde x^k e_i \tilde e_k \tag{8}$$

Thus, if the map $f$ exists, then it is uniquely defined by
specification of arbitrary bases $e_i$ and $\tilde e_k$ in $L$ and $\tilde L$ and by specification
of the basis $e_{ik}$ in $T$ whose elements are the products of $e_i$
by $\tilde e_k$, that is, $e_{ik} = e_i \tilde e_k$ (these equations are to be understood in
the sense that $e_{ik}$ is an image of the pair $e_i$, $\tilde e_k$ precisely under the
map $f$).

But it is easy to see that the desired map $f$ will be found from
these very same conditions. Indeed, let $e_i \in L$, $\tilde e_k \in \tilde L$ and
$e_{ik} \in T$ $(i = 1, \dots, n;\ k = 1, \dots, m)$. We regard each of these
systems to be linearly independent in its space. We take $e_{ik}$
for the images of the pairs $e_i$, $\tilde e_k$ with respect to the desired map $f$,
i.e., we put $e_i \tilde e_k = e_{ik}$. These equations can be ensured since the
number of all $e_i$ is equal to $n$ and the number of all $\tilde e_k$ is equal
to $m$, while the number of all $e_{ik}$ is equal to $nm$. Then we
determine $f$ from (8) on an arbitrary pair $x$, $\tilde x$, where $x$ and $\tilde x$ are given
by (7).
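
This construction can be imitated in coordinates. In the sketch below the space T is taken to be the coordinate space of dimension nm, and the images of the basis pairs are taken to be its standard basis vectors; both choices are assumptions made for illustration, as are the helper names and the sample vectors.

```python
def make_product_map(n, m):
    """Return the map f(x, xt) obtained by extending the basis products bilinearly.

    The product of the i-th basis vector of the first space by the k-th
    basis vector of the second is sent to the standard basis vector
    number i*m + k of the nm-dimensional coordinate space.
    """
    def f(x, xt):
        # The component along the (i, k)-th basis product is x^i * xt^k.
        return [x[i] * xt[k] for i in range(n) for k in range(m)]
    return f

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    return [c * a for a in u]

f = make_product_map(2, 3)
x, y = [1, 2], [0, -1]            # hypothetical elements of the first space
xt, yt = [3, 0, 1], [1, 1, 2]     # hypothetical elements of the second space

distributive_left = f(add(x, y), xt) == add(f(x, xt), f(y, xt))
distributive_right = f(x, add(xt, yt)) == add(f(x, xt), f(x, yt))
associative = f(scale(4, x), xt) == f(x, scale(4, xt)) == scale(4, f(x, xt))
```

The three checks correspond to the distributive and associative properties of the map; nonsingularity for the chosen bases holds by construction, since the basis products are sent to distinct standard basis vectors.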

The properties (1) and (2), Subsection 3, are readily verified
for the map $f$ thus constructed. Let us verify, say, the identity (2).
Let $y = \sum y^i e_i$. Then

$$(x + y) \tilde x = \sum (x^i + y^i) \tilde x^k e_i \tilde e_k = \sum x^i \tilde x^k e_i \tilde e_k + \sum y^i \tilde x^k e_i \tilde e_k = x \tilde x + y \tilde x$$

It is a bit more difficult to verify the third property (the
nonsingularity of the constructed map $f$). Let us take any new pair of
bases $a_{i'}$ and $\tilde a_{k'}$ in $L$ and $\tilde L$. We have to demonstrate that the
product pairs $a_{i'} \tilde a_{k'}$ (that is, the images of the pairs $a_{i'}$, $\tilde a_{k'}$ in $T$) are
linearly independent. We have

$$a_{i'} = \sum P^{i}_{i'} e_i, \qquad \tilde a_{k'} = \sum \tilde P^{k}_{k'} \tilde e_k \tag{9}$$

whence

$$a_{i'} \tilde a_{k'} = \sum P^{i}_{i'} \tilde P^{k}_{k'} \, e_i \tilde e_k$$

We see that the vectors $a_{i'} \tilde a_{k'}$ are linearly expressible in terms of
$e_i \tilde e_k$. Hence the rank of the system $a_{i'} \tilde a_{k'}$ in the space $T$ does not
exceed the rank of the system $e_i \tilde e_k$.

But from (9) we get

$$e_i \tilde e_k = \sum Q^{i'}_{i} \tilde Q^{k'}_{k} \, a_{i'} \tilde a_{k'} \tag{10}$$

Here the quantities $Q^{i'}_{i}$ are defined in standard fashion from
$P^{i}_{i'}$ (see Section 1). $\tilde Q^{k'}_{k}$ are defined from $\tilde P^{k}_{k'}$ in similar fashion.
From (10) we conclude that the rank of the system $e_i \tilde e_k$ does not
exceed the rank of the system $a_{i'} \tilde a_{k'}$. Hence the ranks of these
systems are equal. And since, by hypothesis, the system $e_i \tilde e_k$ is
independent in $T$, so also is the system $a_{i'} \tilde a_{k'}$ independent (because
it has the same rank, which is equal to the total number of
vectors).

To summarize, then, we have proved the existence of the maps
we need and have completely elucidated the arbitrariness of their
construction.

6. Returning to the definition in Subsection 4, we conclude that
we have defined $L \otimes \tilde L$ with exactly the same arbitrariness as
there is in the choice of the map $f$.

7. We denote by $\varphi$ an arbitrary one-to-one mapping $t' = \varphi(t)$
of the space $T$ onto itself, this mapping being a linear isomorphism
of the space (see Section 10, Chapter I). We denote by $f'$ a
composition of the maps $f$ and $\varphi$; symbolically, $f' = \varphi f$. This equation is
to be understood as follows: first $f$ carries an arbitrary pair $a$, $\tilde a$
($a \in L$, $\tilde a \in \tilde L$) into element $t$ of space $T$ and then $\varphi$ carries $t$ into
$t' = \varphi(t)$.

Definition. The tensor products of $L$ by $\tilde L$ that have been
established by means of the maps $f$ and $f'$ will be called isomorphic if
$f' = \varphi f$, where $\varphi$ is some linear isomorphism of $T$ onto itself. The
tensors $t$ and $t'$ will be said to be corresponding under the given
isomorphism of tensor products if $t' = \varphi(t)$.

In accordance with this definition, all the tensors constructed
by means of $f$ are mapped into tensors constructed with the aid
of $f' = \varphi f$. In order to explain why a consideration of tensors
constructed by $f$ is equivalent to a consideration of their images
under an isomorphism, let us examine the situation from the
arithmetical point of view; that is, we will examine the tensors in terms
of components.

Let $(e_i \tilde e_k)' = \varphi(e_i \tilde e_k)$, $t = \sum t^{ik} e_i \tilde e_k$. Then

$$t' = \sum t^{ik} (e_i \tilde e_k)' \tag{11}$$

where $t^{ik}$ are the very same numbers as in (5). Thus, under an
isomorphism, the corresponding tensors have the same components
relative to the appropriate bases. Thus, under an isomorphism, the
only thing that changes is the representation of tensors in the
form of certain elements of the space $T$. But the tensor
components, and, hence, all equations referring to them in any kind of
problem, remain unaltered.

8. Finally, we offer the following proposition, the proof of which
we leave to the reader: if $f$ and $f'$ are two maps of a pair of spaces
$L$ and $\tilde L$ into the space $T$ satisfying the conditions (1), (2), (3) of
Subsection 3, then there is an isomorphism $\varphi$ of space $T$ onto
itself such that $f' = \varphi f$.

From this follows the

Theorem. All tensor products of a given linear space $L$ by a
given linear space $\tilde L$ are isomorphic to one another.
§ 1. Groups and subgroups. Distribution of bases into classes with
respect to a given subgroup of matrices. Orientation

1. Given a set $G$ for the elements of which an equality (or
admissible replacement, see Section 1, Chapter I) has been
established; also given is an operation called multiplication. This
operation associates to every pair of elements $a$, $b$ in $G$, taken in a
specific order, an element $c$ of that set. Symbolically we write
$c = ab$ and say that $c$ is the product of $a$ by $b$. It is assumed that
the product $ab$ is invariant to admissible replacements of the
factors $a$ and $b$.

Definition. A set $G$ together with the operation of multiplication
specified in it is said to be a group (relative to this operation) if
the following axioms hold true.

(1) For any $a, b, c \in G$

$$(ab)c = a(bc)$$

(2) There exists an element $e \in G$ such that for any $a \in G$ we
have

$$ae = a$$

The element $e$ is called the unit of the group.

(3) For any $a \in G$ there exists an $x \in G$ such that

$$ax = e$$

This element is called the inverse of $a$ and is denoted by $a^{-1}$.

2. The following propositions follow readily from Axioms (1),
(2), (3).

(a) If $ax = e$, then $xa = e$.

Proof. By Axiom 3 there exists a $y \in G$ such that $xy = e$. On
the other hand, if $ax = e$, then $a = ae = a(xy) = (ax)y = ey$,
whence $a = ey$. Hence, $xa = x(ey) = (xe)y = xy = e$.

(b) $ea = a$ for any $a \in G$.

Proof. By Axiom 3 and from what has been proved there exists
an $x$ such that $ax = e$ and $xa = e$. Thus,

$$ea = (ax)a = a(xa) = ae = a$$

(c) If $ax = e$, $ay = e$, then $y = x$.

Proof. By (a), $ay = e$ implies $ya = e$. We have
$y = ye = y(ax) = (ya)x = ex = x$.

The foregoing theorems signify that there is no necessity, in a
group, to distinguish between left and right inverses and also
between left and right units. Besides, in a group there is always
uniquely defined an operation inverse to group multiplication;
namely, the equation $ax = b$ has the unique solution $x = a^{-1}b$ and
the equation $xa = b$ has the unique solution $x = ba^{-1}$. This means,
finally, that every group has only one unit. Indeed, if $ae = a$ and
$ae^* = a$, then, as has been proved, $e^* = e$.
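
For a finite group these statements can be verified by direct enumeration. The sketch below takes the permutations of three symbols under composition as an example; this choice of group, and the names used, are ours.

```python
from itertools import permutations, product

elements = list(permutations(range(3)))   # the six permutations of 0, 1, 2

def mul(a, b):
    """Composition of permutations: (ab)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(3))

# Axiom (1): associativity.
associative = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                  for a, b, c in product(elements, repeat=3))

# Axiom (2): an element e with ae = a for all a.
right_units = [e for e in elements if all(mul(a, e) == a for a in elements)]
e = right_units[0]

# Axiom (3): for each a some x with ax = e.
right_inverse = {a: next(x for x in elements if mul(a, x) == e)
                 for a in elements}

# Propositions (a), (b), (c), checked by enumeration.
left_inverse_ok = all(mul(right_inverse[a], a) == e for a in elements)
left_unit_ok = all(mul(e, a) == a for a in elements)
inverse_unique = all(sum(1 for x in elements if mul(a, x) == e) == 1
                     for a in elements)

# The group is not commutative.
commutative = all(mul(a, b) == mul(b, a) for a in elements for b in elements)
```

Only the right-hand versions of the unit and inverse axioms are assumed; the left-hand versions and the uniqueness come out of the enumeration, in agreement with propositions (a), (b) and (c).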

3. An important instance of a group is the set of all nonsingular
$n \times n$ matrices (either real or complex) together with the
multiplication operation defined in Section 2 of Chapter II. In the group
of nonsingular $n \times n$ matrices the unit is the unit matrix $E$. The
inverse of a nonsingular matrix is constructed in accordance
with Subsections 4 to 8 of Section 3, Chapter II. We leave it to
the reader to verify the first axiom of the group for multiplication
of matrices (that is, associativity: $(AB)C = A(BC)$).

The matrix example shows that group multiplication is, in
general, not commutative (see Subsection 3, Section 2, Chapter II).

4. A group is said to be commutative or Abelian if for any
elements $a$, $b$ in the group we have $ab = ba$. Incidentally, in this
case the group operation is frequently called addition and in place
of $ab$ we write $a + b$. Then the unit of the Abelian group is called
the zero element.

Examples. (1) Every linear space is an Abelian group under
the operation of addition of elements. This is clear since the first
four axioms of a linear space coincide precisely with the three
axioms of a group with the supplementary condition of
commutativity.

(2) The set of all real numbers different from zero forms a
commutative group under the operation of ordinary multiplication. The
unit of this group is the number one, the inverse of $\lambda$ is the
number $\lambda^{-1}$.

5. Definition. A subset $\tilde G$ of elements of a group $G$ is called a
subgroup if $a \in \tilde G$, $b \in \tilde G$ implies $ab \in \tilde G$, and $a \in \tilde G$ implies
$a^{-1} \in \tilde G$.

From this it follows, in particular, that $e \in \tilde G$ (indeed, $e = aa^{-1}$).
Thus, under these conditions the requirements of Axioms 1 to 3,
Subsection 1, hold for $\tilde G$, and the subset $\tilde G$ itself is a group under
the same operation of multiplication as is given in the whole group $G$.
The unit of the group $G$ is the unit of every subgroup.

6. Examples of subgroups. (1) In an arbitrary group $G$ the unit $e$
constitutes a subgroup consisting of a single element.

(2) The entire group $G$ may be viewed as a subgroup of $G$.

(3) If a linear space $L$ is viewed as a group under addition,
then any subspace of $L$ is a subgroup. We suggest that the reader
construct a subgroup in $L$ that is not a subspace.

(4) In the group of real numbers different from zero (see
Example 2 of Subsection 4) all positive numbers form a subgroup.

(5) In the same group there is another subgroup consisting of
two elements: the numbers $\lambda = 1$ and $\lambda = -1$.

(6) In the group of all real nonsingular $n \times n$ matrices let us
consider the subset $\tilde G$ consisting of matrices with a positive
determinant. From the theorem on the determinant of a product of
matrices (Chapter II, Section 3) it follows that $\tilde G$ is a subgroup.

Indeed, if $A, B \in \tilde G$, then $\det AB = \det A \cdot \det B > 0$, hence
$AB \in \tilde G$. If $A \in \tilde G$, then $\det A^{-1} = (\det A)^{-1} > 0$ and, hence,
$A^{-1} \in \tilde G$.

(7) In the group of all real (or complex) nonsingular $n \times n$
matrices we consider the subset $\tilde G$ consisting of all matrices whose
determinants have absolute value 1. It is easy to see that $\tilde G$ is
a subgroup. Indeed, if $A, B \in \tilde G$, then $|\det AB| = |\det A| \cdot |\det B| = 1$,
hence $AB \in \tilde G$; if $A \in \tilde G$, then $|\det A^{-1}| = |\det A|^{-1} = 1$,
hence $A^{-1} \in \tilde G$.

7. Let a subgroup $G$ be taken in the group of all nonsingular
$n \times n$ matrices. We consider the $n$-dimensional linear space $L$
(real if $G$ consists of real matrices, and complex if the matrices
are complex). In $L$ take a basis $e_1, \dots, e_n$ and pass to a different
basis

$$e_{i'} = \sum P^{i}_{i'} e_i \tag{1}$$

provided that the coefficients $P^{i}_{i'}$ constitute a matrix $P$ of the
subgroup $G$.

We condense (1) to

$$e' = Pe \tag{1a}$$

regarding (1a) as a matrix equation in which the elements of the
column matrices $e$ and $e'$ are vectors and the elements of the
square matrix $P$ are scalars:

$$e = \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix}, \qquad e' = \begin{pmatrix} e_{1'} \\ \vdots \\ e_{n'} \end{pmatrix}, \qquad P = \begin{pmatrix} P^{1}_{1'} & \dots & P^{n}_{1'} \\ \vdots & & \vdots \\ P^{1}_{n'} & \dots & P^{n}_{n'} \end{pmatrix}$$

By taking for $P$ all possible matrices in $G$, we get a diversity of
bases $e'$ that form a certain class of bases, which class we denote
by $\mathcal{E}(e)$. We will say that the class $\mathcal{E}(e)$ is generated by the
basis $e$ with respect to the given subgroup $G$.

8. From the fact that $G$ is a subgroup there follows an important
theorem.

Theorem. If some basis $e'$ belongs to $\mathcal{E}(e)$, then the class
generated by the basis $e'$ coincides with $\mathcal{E}(e)$. Symbolically:
$\mathcal{E}(e') = \mathcal{E}(e)$.

Proof. Let $e''$ be an arbitrary basis. Suppose that $e'' \in \mathcal{E}(e')$.
This means that there exists a matrix $P' \in G$ for which $e'' = P'e'$.
On the other hand, $e' \in \mathcal{E}(e)$. Accordingly, we have the matrix
$P \in G$ for which $e' = Pe$. Whence $e'' = (P'P)e$. But since $G$ is a
subgroup and since $P \in G$, $P' \in G$, it follows that $P'P \in G$. Hence
$e'' \in \mathcal{E}(e)$. Thus, every basis in $\mathcal{E}(e')$ enters into $\mathcal{E}(e)$, that is,
the class $\mathcal{E}(e')$ is contained in the class $\mathcal{E}(e)$.

Now note that in the case $e' = Pe$, $P \in G$, it is true that
$e = P^{-1}e'$ with $P^{-1} \in G$ (since $G$ is a subgroup). In other words,
if $e' \in \mathcal{E}(e)$, then $e \in \mathcal{E}(e')$. Hence, in the preceding argument we
can interchange $\mathcal{E}(e)$ and $\mathcal{E}(e')$. Therefore, the class $\mathcal{E}(e)$ is
included in the class $\mathcal{E}(e')$; hence $\mathcal{E}(e') = \mathcal{E}(e)$ and the theorem is
proved.

Remark. Since the choice of basis that generates a class is
immaterial within the framework of that class, we will henceforth
write $\mathcal{E}$ in place of $\mathcal{E}(e)$.

9. From what was proved in Subsection 8 it follows that the set
of all bases in $L$ breaks up into classes with respect to a given
subgroup $G$ so that each basis is accommodated by exactly one
class (two classes either do not have common bases or are
completely coincident). Each class $\mathcal{E}$ is invariant with respect to a
given subgroup $G$. This means that for any basis $e \in \mathcal{E}$ and for
any matrix $P \in G$, $e' = Pe \in \mathcal{E}$. (In other words, after a
transformation by means of any matrix of the subgroup $G$, every basis
of the class $\mathcal{E}$ remains in that class.)

10. Example. Let $L$ be a Euclidean plane (more precisely, a
linear space of vectors lying in that plane) and $G$ a subgroup of
second-order real matrices whose determinants are equal in
absolute value to 1. Then each class $\mathcal{E}$ consists of bases with one and
the same area of the basis parallelogram (Fig. 28) (different
classes are associated with different values of this area). Indeed, let

$$e_{1'} = \alpha e_1 + \beta e_2, \qquad e_{2'} = \gamma e_1 + \delta e_2 \tag{2}$$

Denote by $S$ the area of the basis parallelogram for $e_1$, $e_2$, by $S'$
a similar area for $e_{1'}$, $e_{2'}$. From (2) we have

$$S' = S \, |\alpha\delta - \beta\gamma|$$

If the matrix $P$ of the transformation (2) belongs to $G$, then
$|\alpha\delta - \beta\gamma| = 1$ and $S' = S$. Conversely, if $S' = S$, then $P \in G$.
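
The area relation is easy to confirm numerically. In the sketch below the vectors are given by coordinates in an orthonormal system, which is an assumption about how the plane is represented; the starting basis and the coefficients are hypothetical.

```python
def area(e1, e2):
    """Area of the parallelogram on two plane vectors (orthonormal coordinates)."""
    return abs(e1[0] * e2[1] - e1[1] * e2[0])

def change_basis(e1, e2, alpha, beta, gamma, delta):
    """New basis: e1' = alpha e1 + beta e2,  e2' = gamma e1 + delta e2."""
    e1_new = [alpha * a + beta * b for a, b in zip(e1, e2)]
    e2_new = [gamma * a + delta * b for a, b in zip(e1, e2)]
    return e1_new, e2_new

e1, e2 = [2, 0], [1, 3]          # hypothetical starting basis

# A matrix with determinant 1 keeps the basis in the same class.
same_class = change_basis(e1, e2, 2, 1, 3, 2)     # 2*2 - 1*3 = 1
# A matrix with determinant 3 moves it to a class with a different area.
other_class = change_basis(e1, e2, 1, 1, 0, 3)    # 1*3 - 1*0 = 3

S = area(e1, e2)
S_same = area(*same_class)
S_other = area(*other_class)
```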

11. Let $L$ again denote an $n$-dimensional linear space. We
assume it to be real. Now denote by $G$ the subgroup consisting of
all $n \times n$ matrices with positive determinants.



Fig. 28 


In L take an arbitrary basis e and construct a class S(e) with
respect to the subgroup G. Henceforth we denote this class by S.
It is obvious that the class S does not exhaust all the bases of the
space L.

Indeed, if P is an n × n matrix with negative determinant, then
the basis e' = Pe does not enter into S. Let us take such a basis e'
and construct, relative to the subgroup G, a class S(e'), which
we will denote by S'.

We will now demonstrate that in this case there are no classes
other than S and S'. For any basis e'' of L there will be
nonsingular matrices P' and P'' such that e'' = P''e and e'' = P'e'.
From the latter equation and from the relation e' = Pe we have
e'' = (P'P)e. Hence P'' = P'P, whence det P'' = det P' · det P.
Since det P < 0, the determinants of the matrices P' and P'' have
different signs, which means that one of them is positive. If
det P'' > 0, then e'' ∈ S; if det P' > 0, then e'' ∈ S', which is
what we wanted to establish.

12. Thus, all the bases of the space L are divided, with respect
to the subgroup G (P ∈ G if det P > 0), into two classes.

13. If two bases of L belong to one of these two classes, they are
said to have the same orientation. Two bases are said to have
opposite orientations if they belong to different classes.

The bases of one of these classes are said to be positively
oriented (or right-handed); then the bases of the other class are said
to be negatively oriented (or left-handed). Either one of the two
classes can be chosen as the class of positively oriented bases. If that
choice has been made, then we say that an orientation has been
specified in the space L.

14. Quite naturally, it is not necessary to deal first with groups
in order to express the concept of orientation of a space. This
concept can be explained without involving groups in any way, which
is what we will now do.

Let e_1, ..., e_n and e_{1'}, ..., e_{n'} be two arbitrary bases of a
space L. We have

e_{i'} = Σ P^i_{i'} e_i    (3)

where the coefficients P^i_{i'} constitute a nonsingular matrix P, that
is, det P ≠ 0.

If det P > 0, then the basis e_{i'} is said to have the same
orientation as the basis e_i; if det P < 0, then the basis e_{i'} is said
to have the opposite orientation of the basis e_i.

15. We have the following propositions.

(1) If the basis e_{i'} has the same orientation as the basis e_i,
then e_i has the same orientation as e_{i'}. Indeed, by (3), the vectors
e_{i'} can be expressed in terms of e_i by means of the matrix P.
Conversely, the vectors e_i can be expressed in terms of e_{i'} with the
aid of the matrix P⁻¹. Thus det P⁻¹ = (det P)⁻¹ > 0.

(2) If two bases have the same orientation as a third, then all
three have the same orientation. Let the vectors e_{i'} be expressed
in terms of e_i with the aid of matrix P, let the vectors e_{i''} be
expressed in terms of e_i with the aid of matrix P', and let det P > 0
and det P' > 0. Then the vectors e_{i''} are expressed in terms of e_{i'}
with the aid of matrix P'P⁻¹ and we get

det (P'P⁻¹) = det P' · det P⁻¹ > 0

(3) If two bases are oppositely oriented relative to a third, then
these two bases have the same orientation. If det P < 0 and
det P' < 0, then det (P'P⁻¹) > 0.

16. In the space L choose an arbitrary basis e_i and call it
positively oriented (right-handed). We say that any other basis is
positively oriented (or right-handed) if it has the same orientation
as e_i. We say that any basis is negatively oriented (or left-handed)
that has an orientation opposite to that of basis e_i. Thus all bases
of L will be placed in two classes. By the three propositions proved
in Subsection 15, any two bases of one class have the same
orientation; any two bases taken from different classes have opposite
orientations. The indicated classes have the same status, that is,
any one of them can be chosen to represent the class of positively
oriented bases. If that choice has been made (by indicating the
basis e_i), then we say that an orientation has been specified in the
space.

17. Note in conclusion that the concept of orientation is
essentially connected with the fact that a basis is viewed as an ordered
collection of vectors. If the numbering of the vectors of a basis is
altered so that two vectors are interchanged while the remaining
ones retain their number labels, then the orientation of the basis
is reversed. Indeed, let the bases e_i and e_{i'} be connected by the
relation (3) and let the appropriate change be made in the
numbering of the vectors e_{i'}. Then in matrix P there will be an
interchange of two rows and, hence, the determinant of the matrix will
change sign.
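The statements of Subsections 14 to 17 lend themselves to a quick numerical check. The sketch below is an illustration under stated assumptions, not the authors' construction: each basis is represented by the matrix of its vectors (as rows) relative to one fixed reference basis, and it uses the fact that if A and B are two such matrices, the transition matrix from the first basis to the second is BA⁻¹, whose determinant has the sign of det A · det B. The names det and same_orientation are hypothetical.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (adequate for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def same_orientation(a, b):
    """True if the bases with matrices a and b are joined by a matrix of positive determinant."""
    return det(a) * det(b) > 0


e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]          # reference basis
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # e_1 and e_2 interchanged
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # cyclic renumbering
mirrored = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]  # e_3 replaced by -e_3

print(same_orientation(e, cyclic))         # two interchanges: orientation kept
print(same_orientation(e, swapped))        # one interchange: orientation reversed
print(same_orientation(swapped, mirrored)) # both opposite to e, hence alike
```

The last line illustrates proposition (3) of Subsection 15: two bases oppositely oriented relative to a third have the same orientation.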

§ 2. Transformation groups. Isomorphism and homomorphism 
of groups 

1. Suppose we have a set M of elements that we agree to call
points.

We say that a transformation is given of the set M if to every
point x in M is associated a certain point y of the same set M.
Symbolically,

y = f(x)

Here, y is the image of the point x and x is the inverse image
of y.

Two transformations f and g are said to be equal if g(x) = f(x)
for any point x ∈ M.

A transformation f is said to be one-to-one if every point y ∈ M
is an image of some unique point x ∈ M. In this case, the
transformation which to an arbitrary point y = f(x) associates its
inverse image x is said to be the inverse of the original
transformation f and is denoted by f⁻¹:

y = f(x),   x = f⁻¹(y)

The transformation e is called the identity transformation if

e(x) = x

for any point x in M. It is clear that the identity transformation
is one-to-one and e⁻¹ = e.

In the particular case where M is the number axis (line)
-∞ < τ < +∞ the concept of a transformation coincides with
the concept of a function specified on the entire number axis. If
the function t = f(τ) has a single-valued inverse g(t), also
specified on the entire number line -∞ < t < +∞, then this inverse
function specifies an inverse transformation (symbolically: f⁻¹ = g).

2. Let φ, f be transformations of the set M.

The product of φ by f is said to be the transformation χ given by
the formula

χ(x) = φ[f(x)]

for any point x in M. Symbolically we write χ = φf.

When M is the number line, the product of the transformation
ξ = φ(t) by the transformation t = f(τ) is the composite function
ξ = φ[f(τ)]. In general, a product of transformations is not
commutative (for example, sinh³ τ ≠ sinh τ³).

For any transformation f of an arbitrary set M we have the
obvious identities

fe = ef = f    (1)

and if f is one-to-one, then

f⁻¹f = e,   ff⁻¹ = e    (2)

Confining ourselves to one-to-one transformations, we note that

if fg = e or gf = e, then g = f⁻¹    (3)

A product of transformations is associative, that is,

(ψφ)f = ψ(φf)    (4)

for any three transformations f, φ, ψ of the set M. This is clear
since each of the transformations (4) operates in accord with the
formula y = ψ{φ[f(x)]}.

If the transformations φ, f are one-to-one, then both products φf
and fφ are also one-to-one, and

(φf)⁻¹ = f⁻¹φ⁻¹    (5)

The one-to-oneness of each of the transformations φf and fφ
follows directly from the one-to-one character of φ and f; formula (5)
follows from (1)-(4) since

(f⁻¹φ⁻¹)(φf) = f⁻¹(φ⁻¹φ)f = f⁻¹ef = f⁻¹f = e
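The identities of this subsection can be tried out on a finite set M. The following sketch is an illustration only: it assumes M = {0, 1, 2}, represents a one-to-one transformation as a tuple whose entry number x is the image of x, and uses the hypothetical helper names compose and inverse.

```python
def compose(phi, f):
    """Product phi f: first apply f, then phi, so that x goes to phi[f(x)]."""
    return tuple(phi[f[x]] for x in range(len(f)))


def inverse(f):
    """Inverse of a one-to-one transformation of {0, ..., n-1}."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)


identity = (0, 1, 2)
f = (1, 2, 0)     # 0 -> 1, 1 -> 2, 2 -> 0
phi = (1, 0, 2)   # interchanges 0 and 1
psi = (0, 2, 1)   # interchanges 1 and 2

print(compose(phi, f), compose(f, phi))   # the two products differ
print(inverse(compose(phi, f)) == compose(inverse(f), inverse(phi)))
```

Here the products φf and fφ are different transformations, while formula (5) holds: the inverse of φf coincides with f⁻¹φ⁻¹.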

3. From the definitions and properties stated in Subsections 1, 2 
it follows that all possible one-to-one transformations of a given 
set M constitute a group with respect to multiplication (composi¬ 
tion) of transformations. 

4. Definition. Any collection G of transformations of a set M
is called a group of transformations of that set if G forms a group
under multiplication of transformations.

The third axiom implies that only one-to-one transformations
can constitute a group of transformations. We can therefore say
that every group of transformations of M is a subgroup of the
group of all one-to-one transformations of the set.

5. Throughout this chapter we consider only one-to-one trans¬ 
formations and frequently do not stipulate this condition. 

6. Let G be a collection of transformations of a set M. Since the
associative law (4) holds for any three transformations, G will be a
group if:

(a) from the fact that two transformations f, φ belong to G it
follows that fφ ∈ G and φf ∈ G;

(b) from the fact that a transformation f belongs to G follows
the existence and membership in G of the inverse
transformation f⁻¹.

From this now follows the membership in G of the identity 
transformation e; it is the unit of the group G (in this connection 
see formulas (1) and (2)). 

7. By way of an important example we take the group of all
nonsingular (real or complex) linear transformations of n
variables. In this case the set M is an n-dimensional coordinate real or
complex space. That nonsingular linear transformations constitute
a group was actually shown in Section 3 of Chapter II. There it
was demonstrated that the product of nonsingular linear
transformations is a nonsingular linear transformation, and the inverse
of a nonsingular linear transformation is a nonsingular linear
transformation. Thus conditions (a) and (b) of Subsection 6 are
satisfied.

8. Let G, G' be two groups and let the group G be mapped
onto G'. Let us agree to use primes to indicate images: for
example, a' ∈ G' is the image of the element a ∈ G.

Definition. A one-to-one mapping of G onto G' is called an
isomorphism if the image of a product is equal to the product of the
images; symbolically,

(ab)' = a'b'    (6)

We now prove that under an isomorphism the image e' of the
unit e of G is the unit of the group G'. Indeed, let a' be any
element of G'; it corresponds to an element a of group G. We thus
have ae = a and so, by the definition of an isomorphism,

a'e' = (ae)' = a'    (7)

or e' is the unit of G'.

If an isomorphism exists from G to G', then the groups G and 
G' are said to be isomorphic to each other. Under an isomorphism, 
all the relations between elements of one group carry over to the 
other group. Therefore from the viewpoint of group theory the two 
isomorphic groups have the same structure. It is enough to study 
one in order to know the other. 

Examples. (1) Let G be the group of all real nonsingular n × n
matrices and G' the group of all real nonsingular linear
transformations of n variables (viewed as transformations of coordinate
space K_n). We associate with an arbitrary matrix A in G a linear
transformation in G' having the matrix A. In this fashion, the
group G is mapped one-to-one onto G'. This mapping is an
isomorphism since the transformation with the matrix AB is the
product of transformations having matrices A and B. Under this
isomorphism, all group relations, in particular, all subgroups, carry
over from G to G'. For instance, the subgroup of matrices in G
whose determinant has absolute value one is associated with a
definite subgroup in G', which consists of linear transformations
with determinant having absolute value one. Later on we will
examine certain other important subgroups in G and G' that
correspond to one another.
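The key step of Example (1), that the transformation with matrix AB is the product of the transformations with matrices A and B, can be illustrated as follows. This is a minimal sketch under an assumed convention: a matrix A acts on a column of coordinates x by x -> Ax. The names matmul and as_transformation and the sample data are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def as_transformation(a):
    """Return the linear transformation x -> Ax of coordinate space."""
    def transform(x):
        return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]
    return transform


A = [[1, 2], [0, 1]]
B = [[0, 1], [-1, 3]]
x = [5, -2]

# Image of the product AB versus the product of the images of A and B.
image_of_product = as_transformation(matmul(A, B))(x)
product_of_images = as_transformation(A)(as_transformation(B)(x))

print(image_of_product, product_of_images)
```

Both computations give the same point, which is condition (6) for this mapping checked at one sample point.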

(2) If the linear spaces L and L' are linearly isomorphic, then
they are also isomorphic as groups. Generally speaking, the
converse is not true. Indeed, from Sections 10, 11 of Chapter 1 it
follows that an n-dimensional complex space C_n and a real space L_{2n}
of dimension 2n are isomorphic as groups (relative to the
operation of addition of vectors), yet at the same time they are not
linearly isomorphic spaces.

(3) Using Subsection 6, we can easily verify that the collection
of linear functions t = λτ for all possible λ ≠ 0 forms a group
of transformations of the number line -∞ < τ < +∞. Denote
this group by G. It is called the group of linear transformations
of the number line. Let G' be the group of real numbers λ (λ ≠ 0)
under multiplication. Associating to every transformation t = λτ
a number λ, we get an isomorphic mapping of G onto G', which is
a special case of Example (1) when n = 1. In Subsection 6 of
Section 1 we indicated two subgroups in G'. Associated to them
in G are two subgroups defined by the conditions: (1) λ = ±1,
(2) λ > 0.

The first of these subgroups consists of only two
transformations: the identity mapping of the number axis t = τ and the
reflection t = -τ.

The second subgroup consists of an infinite set of
transformations, namely, of all linear transformations preserving the
direction of the number axis.

9. Definition. A mapping of a group G into a group G' is called
a homomorphism if the image of the product of any two elements
of G is the product of their images in G'.

In other words, only condition (6) must be obeyed. Note that it
may happen that a' = b' when a ≠ b and some elements of G'
are not images of any elements of G. We symbolize a
homomorphism by G → G'.

It is clear that an isomorphism is a special case of a
homomorphism.

Another special case of a homomorphism is obtained if we
associate to each element of an arbitrarily chosen group G the unit of
some group G'. Then we have: a' = e', b' = e', (ab)' = e' = a'b',
and (6) holds.

10. Theorem. Under any homomorphism G → G', the image of
the group G is a subgroup of G'.

Proof. Denote the image of G by Ḡ. Let a', b' be arbitrary
elements in Ḡ, and a, b certain of their inverse images. By (6), Ḡ is
closed under multiplication:

a'b' = (ab)' ∈ Ḡ    (8)

From (6) it also follows that

a'(a⁻¹)' = (aa⁻¹)' = e'    (9)

As in Subsection 8 (see formula (7)), it is established that the
image e' of the unit element of the group G is the unit of G'.
Therefore (9) signifies that

(a')⁻¹ = (a⁻¹)' ∈ Ḡ    (10)

The relations (8) and (10) show that Ḡ satisfies the definition
of a subgroup.

Remark. It can be demonstrated that the collection of all
inverse images of the unit element of the group G' under the
homomorphism G → G' forms a subgroup of G called the kernel of the
homomorphism G → G'. We will not dwell on the proof.

11. We now consider some examples of homomorphisms that will
come in handy in the future.

Let G be the group of all nonsingular real n × n matrices, G'
the group of real numbers λ (λ ≠ 0) under multiplication. We
construct the following mappings of G into G':

(1) to each matrix A ∈ G is associated the same number λ = 1;

(2) all matrices for which det A > 0 map into the number
λ = +1; all matrices for which det A < 0 have as their image the
number λ = -1;

(3) let α be any fixed real number; to every matrix A ∈ G is
associated a number λ = |det A|^α;

(4) to the matrix A is associated λ = |det A|^α if det A > 0, and
λ = -|det A|^α if det A < 0.

We have the homomorphism G → G' in all four examples. In the
first example this occurs because the entire group G is mapped
into the unit of the group G', in the other three cases because of
the theorem on the determinant of a product of matrices.
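The four mappings can be tested against condition (6) on a few sample matrices. The sketch below is illustrative only: it assumes n = 2, integer matrices, and the integer exponent α = 3 so that all arithmetic is exact; the names h1 to h4 and ALPHA are hypothetical labels for mappings (1) to (4).

```python
from itertools import product


def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


ALPHA = 3  # assumed sample value of the fixed real number alpha


def h1(m):
    return 1


def h2(m):
    return 1 if det2(m) > 0 else -1


def h3(m):
    return abs(det2(m)) ** ALPHA


def h4(m):
    return h2(m) * h3(m)


samples = [((2, 1), (1, 1)), ((0, 1), (1, 0)), ((1, 2), (3, 4))]  # det = 1, -1, -2

# Condition (6): the image of a product equals the product of the images.
results = {
    h.__name__: all(h(matmul(x, y)) == h(x) * h(y) for x, y in product(samples, repeat=2))
    for h in (h1, h2, h3, h4)
}
print(results)
```

A finite check of this kind does not prove the homomorphism property; it only confirms the rule on the chosen samples, the general statement resting on det(AB) = det A · det B.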

Instead of the group of numbers λ (λ ≠ 0) we can take the
isomorphic group of linear transformations t = λτ (λ ≠ 0) of the
number line. Then we get four homomorphisms in which the
images of G are groups of transformations of the number line
consisting, respectively, of

(1) the single identity transformation: t = τ;

(2) two transformations: t = τ and t = -τ;

(3) all transformations t = λτ for which λ > 0;

(4) all linear transformations t = λτ (λ ≠ 0).

It turns out that the above four types of mappings of G into G'
given in this subsection exhaust all possible homomorphisms of G
into G'. Essential use will be made of this assertion in the next
section (see Section 3, Subsection 8), where hints with respect to
the proof will be given.

§ 3. Invariants. Axial invariants. Pseudoinvariants 

1. We have an n-dimensional linear space L, which we assume
to be real. This is done to simplify further formulations. Suppose
in L we choose a fixed class of bases S with respect to some
subgroup G of nonsingular (real) n × n matrices.

Besides L we consider a set T, the specific nature of the
elements of which is immaterial. Actually, for T we will have
collections of geometric entities of the space L or some kind of
algebraic entities connected with this space.

Let there be given a numerical function a of two arguments:
an arbitrary element t of the set T and an arbitrary basis e taken
from the class S:

a = ψ(t, e)    (1)

Suppose that the values of the function (1) are real and, for all
possible t ∈ T, fill the entire number axis -∞ < a < +∞ (for
any fixed e).

We can imagine (as is frequently done in geometry) that the
right member of (1) is a symbolic notation for a function of the
coordinates of entity t. Accordingly, instead of (1) we can write

a = ψ(x_1, x_2, ..., x_N)    (1a)

where x_1, x_2, ..., x_N are the coordinates of t relative to the basis
e, that is to say, numbers which in some way determine the entity
t when the basis e is specified.

Example. The entity t is a parallelogram constructed in a
Euclidean plane on an ordered pair of vectors p, q:

p = {x_1, x_2},   q = {y_1, y_2}

For (1a) we write (in this case)

a = x_1 y_2 - x_2 y_1

Here it is assumed that x_1, x_2 and y_1, y_2 are the coordinates
(components) of p and q relative to a basis e. Then x_1, x_2, y_1, y_2 may
be taken to be the components of t in the same basis. If e is taken
in the class of orthonormal bases of a Euclidean plane, then the
number a will be an oriented area of the parallelogram t.

2. Suppose we have the value a = ψ(t, e) and suppose that we
pass from the basis e to any other basis e' of the same class S:

e' = Pe,   P ∈ G

We will require that the number a' = ψ(t, e') be defined by
specification of the number a and the matrix P without any more
information about the entity t and the original basis e. In other
words, we assume that a' is a function of a and of the elements
P^i_{i'} of the matrix P = ||P^i_{i'}||. Symbolically,

a' = f(a, P)    (2)

We also assume that for every matrix P in G the function (2)
specifies a one-to-one transformation of the number axis
-∞ < a < +∞, which transformation we indicate by the symbol f_P.
In place of (2) we will often make use of the equivalent notation

a' = f_P(a)

3. When conforming to the requirements of Subsections 1 and 2,
we will say that a scalar quantity a has been specified in the
space L on the set T relative to the group G. Formula (2) is
called the law of transformation of the scalar quantity a.

4. The function f(a, P) cannot be chosen arbitrarily. The
conditions of Subsection 2 impose rigorous restrictions, the essence of
which consists in the transformations a' = f_P(a) constituting a
group. More precisely, we have the

Theorem. The law of transformation of a scalar quantity is a
homomorphism of the group G of matrices into the group of all
one-to-one transformations of the number line.

Explanation. Formula (2) indicates that to every matrix P ∈ G
is associated a transformation a' = f_P(a) of the number line
-∞ < a < +∞. The theorem asserts that this correspondence
is a homomorphic mapping.

Denote by H the set of transformations a' = f_P(a)
corresponding to all possible matrices P in G. Then there follows from the
theorem and from Subsections 4 and 10 of Section 2 the

Corollary. The set H is a group of transformations of the
number line -∞ < a < +∞.

Proof of the theorem. Since the one-to-one nature of each
transformation a' = f_P(a) is given, and all one-to-one transformations
of the number line -∞ < a < +∞ constitute a group, it suffices
to verify that

f_{P'} f_P = f_{P'P}    (3)

for any matrices P, P' in G. Take an arbitrary basis e ∈ S and
consider the bases e' = Pe, e'' = P'e' = (P'P)e. We denote by
a, a', a'' the values of the scalar quantity (1) relative to the
bases e, e' and e'', respectively. By Subsection 2, we have

a'' = f(a', P') = f(a, P'P)    (4)

From (2) and (4) we get the condition imposed on the function f:

f(f(a, P), P') = f(a, P'P)    (5)

To the right member of (5) corresponds the transformation f_{P'P}.
The composite function on the left of (5) is associated with a
transformation equal to the product f_{P'} f_P. Therefore relation (5)
(where a is any scalar and -∞ < a < +∞) is equivalent to
(3). This completes the proof.

5. Remark. The truth of the theorem may be demonstrated by
somewhat more pictorial reasoning. Let the matrix P ∈ G yield
a transition from the basis e to the basis e', and the matrix P' ∈ G
a transition from the basis e' to the basis e''. Then the matrix
P'P ∈ G yields a direct transition from e to e''. Then we
recalculate the values of our quantity by proceeding from e and going
over to e'': once via e', the next time directly. If f_{P'} f_P ≠ f_{P'P}, then
we get different results, which is inadmissible since the value of
the quantity must be uniquely defined for each entity in T relative
to each basis of the class S. Hence f_{P'} f_P = f_{P'P}, which is what we
wanted.
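The reasoning of this remark can be imitated numerically by recalculating a value along both routes. The sketch below is an illustration, not taken from the text: valid_law is the assumed sample law a' = a (det P)^2, which satisfies (5), while invalid_law is a deliberately wrong candidate, a' = a + det P - 1, which is one-to-one in a for each P yet violates (5). All names and sample matrices are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def valid_law(a, p):
    """Sample law a' = a (det P)^2; it satisfies condition (5)."""
    return a * det2(p) ** 2


def invalid_law(a, p):
    """A candidate law that is one-to-one in a but violates condition (5)."""
    return a + det2(p) - 1


def two_routes(law, a, p, p_prime):
    """Value in the basis e'': once via e', once directly by the matrix P'P."""
    via_intermediate = law(law(a, p), p_prime)
    direct = law(a, matmul(p_prime, p))
    return via_intermediate, direct


P = ((2, 0), (0, 1))        # det P = 2
P_PRIME = ((1, 1), (0, 3))  # det P' = 3

print(two_routes(valid_law, 5, P, P_PRIME))
print(two_routes(invalid_law, 5, P, P_PRIME))
```

For the first law both routes agree; for the second they give different numbers, so it cannot serve as the law of transformation of a scalar quantity.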


6. Every homomorphism of a group G into any group of trans¬ 
formations of the number line specifies a scalar quantity in the 
space L. 

Let us go into this matter in more detail. Let a homomorphism
associate to a matrix P ∈ G a one-to-one transformation f_P of the
number line, that is, a function a' = f_P(a) specified on the entire
line and having an inverse function also specified on the entire
line. Set f(a, P) = f_P(a). Then from (3) follow (4) and (5),
whence it follows that a is uniquely determined in all bases of the
class S.

We have to construct a set T, that is, we have to determine the
geometric entities t on which a scalar quantity a would be
specified with a given law of transformation f(a, P). This can be done
in different ways. For instance, for t we can take a point of the
number line with coordinate a on the ordinary Cartesian scale. At
the same time we must assume that an arbitrary basis e has been
chosen in the class S. In going over to a new basis e' = Pe, we
pass to a different scale on the number line by the formula
a' = f_P(a). We will assume that, relative to the basis e', the same
point t is associated with its coordinate a' on the new scale. Then
all requirements of Subsections 1, 2 will be complied with.


7. With particular frequency we encounter so-called linear
geometric entities or linear scalar quantities, which are characterized
by the fact that the transformations f_P are linear, that is, the
transformation law (2) is of the form

a' = f(P)a

In this case, in place of (5) we have a simpler relation:

f(P)f(P') = f(PP')    (6)

imposed solely on the matrices P, P' (any matrices in G).

The relation (6) can readily be derived at once without resorting
to (5). Indeed, if

a' = f(P)a,   a'' = f(P')a'

then

a'' = f(P)f(P')a

On the other hand, we must directly have

a'' = f(P'P)a

so that f(P)f(P') = f(P'P). Since P and P' are arbitrary matrices
in G and the factors on the left are numbers, interchanging the
roles of P and P' gives (6).

8. Now let G be the group of all real nonsingular n × n
matrices (n fixed).

As has already been mentioned (without proof) in Subsection 11,
Section 2, all homomorphisms of G into the group of real numbers
(under multiplication) reduce to four types of mappings that are
listed in Subsection 11, Section 2.

This assertion can be expressed as follows.

If a numerical function f(P) of a matrix argument satisfies (6)
for any P, P' ∈ G, then:

(1) either f(P) = 1 for all P ∈ G;

(2) or f(P) = 1 for P ∈ G and det P > 0, and f(P) = -1 for
P ∈ G and det P < 0;

(3) or f(P) = |det P|^σ for all P ∈ G (here σ is a given real
number);

(4) or f(P) = ±|det P|^σ, where the plus sign occurs for P ∈ G,
det P > 0, the minus sign for P ∈ G, det P < 0.

Note that the cases (3) and (4) include, in particular, the cases
(1) and (2) when σ = 0. Note also that we do not consider the
trivial case where f(P) is identically zero: f(P) = 0 for all P ∈ G.

The proof of this assertion can only be given with the help of
material taken from subsequent chapters, and so we give the proof
in a special appendix (see Appendix 1).

However, we will make use of the assertion at once. It will
enable us to enumerate all possible types of linear quantities.
Namely, there exist only the following four types of linear scalar
quantities.

(1) Invariants, which are quantities that do not depend on the
choice of basis. Their law of transformation is

a' = a    (I)

for any matrix P. The group H consists of a single identity
transformation of the number axis.

In the preceding chapters we have assumed all along that we
are dealing with such quantities (for instance, when we considered
linear, quadratic, bilinear and multilinear forms).

(2) Axial invariants. Their law of transformation is

a' = a if det P > 0,   a' = -a if det P < 0    (II)

Here the group H consists of two linear transformations a' = a
and a' = -a.

The name “axial invariants” expresses the dependence of these
quantities on the orientation of the coordinate axes. They do not
change when passing to a new basis with orientation preserved,
but they change sign if the orientation of the basis is reversed.


Example. An instance of an axial invariant is the oriented area
of an oriented parallelogram of a Euclidean plane. This quantity 
is positive if the pair of vectors defining the parallelogram and its 
orientation have the same orientation as the basis; otherwise it is 
negative. The elements of the set T are all possible oriented paral¬ 
lelograms on the Euclidean plane. 

(3) Pseudoinvariants of weight σ. Their law of transformation
is

a' = a |det P|^σ    (III)

where σ is a given real number.

Here, to each matrix P is associated a linear transformation
a' = λa for λ = |det P|^σ. The group H consists of all linear
transformations a' = λa with positive coefficient λ.

Example. By Section 3 of Chapter IV, the determinant of the
matrix of an invariant bilinear form transforms by the law

Δ' = Δ (det P)^2

Thus the quantity Δ is a pseudoinvariant of weight σ = 2. The
set T in this example consists of all possible invariant bilinear
forms specified in the space L.

(4) Axial pseudoinvariants of weight σ:

a' = a |det P|^σ if det P > 0,   a' = -a |det P|^σ if det P < 0    (IV)

where σ is a given real number.
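The weight-2 law for the determinant of a bilinear form can be verified on a small case. The sketch below is illustrative and rests on a stated assumption: for a bilinear form with coefficients a_{ij} = a(e_i, e_j) and a new basis e_{i'} = Σ P^i_{i'} e_i, bilinearity gives a_{i'j'} = Σ P^i_{i'} P^j_{j'} a_{ij}, which in matrix form (rows of P numbered by i') reads A' = P A Pᵀ. The names and sample matrices are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[1, 2], [3, 5]]   # coefficients a(e_i, e_j) in the old basis, det A = -1
P = [[2, 1], [1, 3]]   # transition matrix, det P = 5

# Coefficients in the new basis: a(e_i', e_j') = sum P[i'][i] * a_ij * P[j'][j]
A_new = matmul(matmul(P, A), transpose(P))

print(det2(A_new), det2(A) * det2(P) ** 2)
```

The two printed numbers coincide, as the law Δ' = Δ (det P)^2 requires.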

9. If G is some group of real n × n matrices, then the scalar
quantities with transformation laws (I), (II), (III) and (IV) are
called, respectively, invariants, axial invariants, pseudoinvariants
of weight σ, and axial pseudoinvariants of weight σ with respect
to the group G. In the case of the group of matrices for which
|det P| = 1, pseudoinvariants do not differ from invariants and
axial pseudoinvariants do not differ from axial invariants.

But if we impose the condition det P = +1, then all four
classes of quantities become indistinguishable (they reduce to
invariants with respect to the indicated group).

Note in passing that the group of matrices with determinant
equal to unity is ordinarily called the unimodular group (both in
the real and the complex case).

10. The term “invariant” is often used in a broader sense than
that of Subsections 8, 9.

Namely, the invariants of a group of transformations are all
entities, properties, and quantities that are preserved under any
transformation of the given group.

It is obvious that every invariant of a group is an invariant for
any subgroup of that group. The converse is not true: an invariant
of a subgroup may not be an invariant of the entire group. In this
sense we can say that the broader the group of transformations,
the fewer invariants it has, but such invariants reflect
the most stable and profound properties of reality.

Geometry breaks down into a number of divisions, in each of 
which one investigates the invariants of a definite group of trans¬ 
formations of some space. For example, in elementary geometry 
we consider in three-dimensional Euclidean space the properties 
of figures that are preserved under any motion of the figure as a 
rigid body (in other words, invariants of the group of motions of 
three-dimensional Euclidean space). In the chapters that follow 
we will examine several important groups of transformations and 
some of their invariants. 

§ 4. Tensor quantities 

1. We now define certain classes of quantities allied to tensors
and including the latter as a special case. We will not attempt to
interpret these quantities geometrically. All we plan to assume is
that relative to every basis they are specified by a specific set of
numbers called components (coordinates) and that when passing
to a new basis these numbers (components) transform just as the
coefficients of multilinear forms. So as not to complicate our
description with cumbersome formulas, we will suppose that the
components (the coordinates of the quantities) are equipped with two
indices (one lower and one upper). Accordingly, we will consider
forms of two vector arguments (one contravariant and the other
covariant). The passage to any larger number of indices is trivial.

2. Given in an n-dimensional linear space L a bilinear form
a(x, u), x ∈ L, u ∈ L*. If L has a basis e_1, ..., e_n, and L* has a
reciprocal basis e^1, ..., e^n, then x = x^1 e_1 + ... + x^n e_n,
u = u_1 e^1 + ... + u_n e^n and the form a(x, u) can be represented
componentwise (in coordinates):

a(x, u) = Σ a_i^k x^i u_k

where

a_i^k = a(e_i, e^k)    (1)

When passing to a new basis we have

e_{i'} = Σ P^i_{i'} e_i,   e^{k'} = Σ Q^{k'}_k e^k    (2)

The coefficients of the component representation of the form do
not change in the process. The new coefficients a_{i'}^{k'} will be
expressed in terms of the old coefficients in one way or another,
depending on the nature of the form itself as a scalar quantity.
Namely, it may happen that the numerical value a(x, u) of the given
form on an arbitrary pair of vectors x, u will be replaced by a new
value a'(x, u). Then the law of transformation of a(x, u) into
a'(x, u) determines the law of transformation of a_i^k into a_{i'}^{k'}. We
assume that the given form, being a scalar quantity, belongs to
one of the four classes indicated in Subsection 8 of Section 3.
We accordingly consider four cases.

3. (1) The form a(x, u) is an invariant. In this case

a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = a(e_{i'}, e^{k'})

whence, with account taken of (1) and (2), we get

a_{i'}^{k'} = Σ a_i^k P^i_{i'} Q^{k'}_k    (I)

This is the familiar law of transformation of the components of
a mixed tensor of second order.

(2) The form a(x, u) is an axial invariant. In this case

a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = ± a(e_{i'}, e^{k'})

where we have the plus sign if det P > 0 and the minus sign if
det P < 0, whence, taking into account (1) and (2),

a_{i'}^{k'} = ± Σ a_i^k P^i_{i'} Q^{k'}_k    (II)

with the same condition regarding the sign in the right-hand
member.

Quantities whose components transform by law (II) are
called axial tensors.

(3) The form a(x, u) is a pseudoinvariant of weight σ. In this
case

a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = a(e_{i'}, e^{k'}) |det P|^σ

whence, taking into account (1) and (2),

a_{i'}^{k'} = |det P|^σ Σ a_i^k P^i_{i'} Q^{k'}_k    (III)

Quantities whose components a_i^k transform by law (III) are
called pseudotensors of weight σ.

(4) The form a(x, u) is an axial pseudoinvariant of weight σ.
In this case

a_{i'}^{k'} = a'(e_{i'}, e^{k'}) = ± a(e_{i'}, e^{k'}) |det P|^σ

and, accordingly,

a_{i'}^{k'} = ± |det P|^σ Σ a_i^k P^i_{i'} Q^{k'}_k    (IV)

where on the right we have plus if det P > 0 and minus if
det P < 0.

Quantities with components that transform by law (IV) are
called axial pseudotensors of weight σ.

4. The transformation laws (I)-(IV) for $a_i^k$ were derived as
a consequence of the corresponding transformation laws (I)-(IV),
Section 3, for the scalar quantity $a(x, u)$. Clearly, the converse is
also true: if $a_i^k$ transform by the laws (I)-(IV), then for the scalar
quantity

$$a(x, u) = \sum_{i,k} a_i^k x^i u_k$$

we have, respectively, the transformation laws (I)-(IV) of Section 3.

5. In this section we assume that the transformation (2) is
determined by any nonsingular matrix $P$. We can suppose that the
matrices $P$ are taken from some group $G$ while the admissible
bases constitute the appropriate class $\mathcal{S}$. Then the foregoing
definitions give us four classes of tensor quantities under the group $G$.

6. We will view the collection of numbers $a_i^k$ as a point in the
coordinate space $K$ (of dimension $n^2$). Then any one of the four
laws (I)-(IV) defines, via the given matrix $P \in G$, a certain
transformation of the space $K$ (naturally, the appropriate
transformation for each law (I)-(IV)). We denote this transformation
for any one of the laws (I)-(IV) by $f_P$. The following statements
hold true.

(a) The set of all $f_P$ ($P \in G$) is a certain group $H$ of
transformations of the space $K$.

(b) The map of $G$ onto $H$ under which the matrix $P \in G$ is
associated with the transformation $f_P \in H$ is a homomorphism,
namely,

$$f_{P'} f_P = f_{P'P} \qquad (3)$$

for any matrices $P', P \in G$.

The proof is similar to that of Subsection 4, Section 3. 

Relation (3) is very important. If it were not true, then the laws
(I)-(IV) would be meaningless, since distinct transitions to a 
new basis (either directly or via intermediate bases) would yield 
unlike results. 
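
The homomorphism property (3) can be checked numerically. What follows is a minimal sketch in Python, not a proof. It assumes that the components $a_i^k$ are stored as `a[i][k]`, that the matrix $P_{i'}^i$ is stored as `P[i'][i]` (so that $Q$ is the ordinary inverse of `P` and two successive changes of basis compose to the product `P2 * P`), and that the weight is an integer so that exact rational arithmetic can be used. The names `f`, `det`, `inverse` and `matmul` are illustrative helpers, not notation from the text.

```python
from fractions import Fraction

def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def inverse(M):
    """Inverse via cofactors, computed exactly with Fractions."""
    d = Fraction(det(M))
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) / d for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def f(P, a, sigma=0, axial=False):
    """Apply one of the laws (I)-(IV) to the components a[i][k] = a_i^k."""
    d = Fraction(det(P))
    factor = abs(d) ** sigma          # |det P|^sigma, sigma an integer here
    if axial and d < 0:
        factor = -factor              # sign rule of laws (II) and (IV)
    PaQ = matmul(matmul(P, a), inverse(P))
    return [[factor * entry for entry in row] for row in PaQ]

a = [[1, 2], [3, 5]]
P = [[1, 2], [3, 4]]     # first change of basis, det = -2
P2 = [[0, 1], [2, 3]]    # second change of basis, det = -2

for sigma in (0, 1, -1):
    for axial in (False, True):
        direct = f(matmul(P2, P), a, sigma, axial)
        in_two_steps = f(P2, f(P, a, sigma, axial), sigma, axial)
        assert direct == in_two_steps
```

Under these assumptions, passing to the final basis directly or through the intermediate basis gives the same components for each of the four laws.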

7. From the transformation laws (I)-(IV) it follows that if all
the components of a tensor quantity vanish in one basis then they
vanish in any other basis: if $a_i^k = 0$, then $a_{i'}^{k'} = 0$.

8. If the components of a tensor quantity (of any one of the
four classes) are labelled with many indices, then subscripts are
used for those over which, in the transformation laws (I)-(IV),
the summation is with the upper indices of elements of the
matrix $P$. The number of lower indices is called the order of
covariance of the tensor quantity. The remaining indices are written
as superscripts; their number is equal to the order of
contravariance.

Remark. The scalar quantities of any one of the four classes 
mentioned in Section 3 may be viewed as tensor quantities (of the 
appropriate class) of order zero. 

9. Let $T$ denote the set of all tensor quantities of one of the
classes (I)-(IV) of the same type (that is to say, with the same
number of upper and lower indices). Now, if linear operations are
performed on the elements of $T$ in coordinate space (that is, if we
construct the sum of some quantities by adding their appropriate
components, and the product of a quantity by a scalar via
multiplication of all components by that scalar), then we get tensor
quantities of the same set $T$. This is evident if, for the sake of
simplicity, we take for $T$ the set of mixed pseudotensors of order two
of a given weight $\sigma$. We consider two pseudotensors in $T$ with
components $a_i^k$ and $b_i^k$ relative to a certain basis. In the new
basis we get $a_{i'}^{k'}$, $b_{i'}^{k'}$. We can assume that
$a_{i'}^{k'}$ are expressed by the equation (III) of Subsection 3.
Similarly,

$$b_{i'}^{k'} = |\det P|^{\sigma} \sum_{i,k} b_i^k P_{i'}^i Q_k^{k'} \qquad (\mathrm{IIIa})$$

Adding (III) and (IIIa) termwise, we get

$$a_{i'}^{k'} + b_{i'}^{k'} = |\det P|^{\sigma} \sum_{i,k} (a_i^k + b_i^k) P_{i'}^i Q_k^{k'}$$

Thus, the sum of two tensor quantities of $T$ has exactly the
same transformation law as each one of the quantities. Multiplying
both sides of (III) by an arbitrary scalar $\alpha$, we see that
$\alpha a_{i'}^{k'}$ is expressed in terms of $\alpha a_i^k$ by the same law.

10. If two tensor quantities belong to $T$, then their equality is of
an invariant nature. Let’s look at this in more detail. Suppose, for
instance, for $a_i^k$ and $b_i^k$ we have, relative to a single basis,
the equations $b_i^k = a_i^k$ for arbitrary $i$, $k$. Then relative to
any other basis, $b_{i'}^{k'} = a_{i'}^{k'}$. This is evident from the
fact that, by Subsection 9, the difference $b_i^k - a_i^k$ is a tensor
quantity and, hence, its vanishing does not depend on the basis.

Remark. Of course, in a single basis, the equation $b_i^k = a_i^k$ is
possible for any quantities $a_i^k$, $b_i^k$. But if these quantities are
taken from different classes (I)-(IV), that is, if they have different
laws of transformation, then the equation breaks down when
passing to new bases.

11. A product of two arbitrary tensor quantities taken from any
one of the classes (I)-(IV) is constructed by multiplying each
component of one quantity by each component of the other (relative
to the same basis). The resultant quantity will belong to
one of the classes (I)-(IV), depending on the choice of factors.
For example, if $a_i^k$ is a pseudotensor of weight $\sigma_1$ and
$b_l^m$ is a pseudotensor of weight $\sigma_2$, then $a_i^k b_l^m$ will
be a pseudotensor of order four and of weight $\sigma_1 + \sigma_2$.
Indeed, under these assumptions,

$$a_{i'}^{k'} = |\det P|^{\sigma_1} \sum_{i,k} a_i^k P_{i'}^i Q_k^{k'},$$

$$b_{l'}^{m'} = |\det P|^{\sigma_2} \sum_{l,m} b_l^m P_{l'}^l Q_m^{m'}$$

Multiplying these equations together, we get

$$a_{i'}^{k'} b_{l'}^{m'} = |\det P|^{\sigma_1 + \sigma_2} \sum_{i,k,l,m} a_i^k b_l^m P_{i'}^i P_{l'}^l Q_k^{k'} Q_m^{m'}$$

12. Note that the product of two axial tensors is an ordinary
tensor. The product of an ordinary tensor by an axial tensor is an 
axial tensor. 

13. Contraction on a single upper and a single lower index of a
tensor quantity of one of the four classes (I)-(IV) yields a tensor
quantity of the same class. A complete contraction reduces the
tensor to a scalar quantity of the same class. For instance, the
contraction of a mixed pseudotensor of weight $\sigma$ and of order two
is a pseudoinvariant of weight $\sigma$. Indeed, from (III) we have

$$\sum_{\alpha'} a_{\alpha'}^{\alpha'} = |\det P|^{\sigma} \sum_{i,k} a_i^k \Big( \sum_{\alpha'} P_{\alpha'}^i Q_k^{\alpha'} \Big) = |\det P|^{\sigma} \sum_{i,k} a_i^k \delta_k^i = |\det P|^{\sigma} \sum_{\alpha} a_{\alpha}^{\alpha}$$

Thus, for the quantity $a = \sum_{\alpha} a_{\alpha}^{\alpha}$ we have a
transformation law of type (III), Section 3.

§ 5. The oriented volume of a parallelepiped.
The discriminant tensor

1. A basis $e_1, \dots, e_n$ is chosen in an $n$-dimensional linear
space $L$, which means an orientation of the space is given (see
Section 1). In $L$ we take an ordered $n$-tuple of arbitrary vectors
$x_1, \dots, x_n$ and expand each one with respect to the given basis:

$$\begin{aligned} x_1 &= x_1^1 e_1 + \dots + x_1^n e_n, \\ &\dots \\ x_n &= x_n^1 e_1 + \dots + x_n^n e_n \end{aligned}$$

By $X$ we denote the matrix made up of the coefficients of these
expansions, that is, the components of the vectors $x_1, \dots, x_n$
relative to the basis $e_1, \dots, e_n$.

With the ordered $n$-tuple of vectors $x_1, \dots, x_n$ we associate a
number $D(x_1, \dots, x_n)$ equal to the determinant of matrix $X$:

$$D(x_1, \dots, x_n) = \begin{vmatrix} x_1^1 & \dots & x_1^n \\ \vdots & & \vdots \\ x_n^1 & \dots & x_n^n \end{vmatrix}$$

Changing to a new basis

$$e_{i'} = \sum_i P_{i'}^i e_i \qquad (1)$$

we associate with the same $n$-tuple of vectors $x_1, \dots, x_n$ a
number $D'(x_1, \dots, x_n)$ equal to the determinant of the matrix $X'$ made
up of the components of the vectors $x_1, \dots, x_n$ relative to the basis
$e_{1'}, \dots, e_{n'}$.

It is easy to find the law of transformation of the quantity
$D(x_1, \dots, x_n)$. Namely, together with (1) we have the following
equations for the components of any vector:

$$x^{i'} = \sum_i Q_i^{i'} x^i$$

whence we get the matrix equation

$$X' = X Q^*$$

Hence

$$D'(x_1, \dots, x_n) = \det X' = \det X \det Q^* = \det Q^* \, D(x_1, \dots, x_n)$$

But, as we know, $Q^* = P^{-1}$, and so we have the relation

$$D'(x_1, \dots, x_n) = \pm |\det P|^{-1} D(x_1, \dots, x_n) \qquad (2)$$

where on the right we have plus if $\det P > 0$ and minus if
$\det P < 0$.

We see that $D(x_1, \dots, x_n)$ is an axial pseudoinvariant of weight
$\sigma = -1$, which is defined on all ordered $n$-tuples of vectors. Note
that $D(x_1, \dots, x_n) > 0$ if the vectors $x_1, \dots, x_n$ are linearly
independent and the ordered $n$-tuple $x_1, \dots, x_n$ is positively oriented
(that is, the orientation is the same as that of the basis $e_1, \dots, e_n$).


2. From the properties of a determinant it follows immediately
that $D(x_1, \dots, x_n)$ is a multilinear form, that is, a function linear
in each vector argument.

The form $D(x_1, \dots, x_n)$ is skew, that is, it is skew-symmetric
in any pair of arguments (since the determinant changes sign under
an interchange of two rows).

Expanding the determinant by its definition, we get a component
representation of the form $D(x_1, \dots, x_n)$ relative to the basis
$e_1, \dots, e_n$:

$$D(x_1, \dots, x_n) = \sum_{i_1, \dots, i_n} \delta_{i_1 \dots i_n} x_1^{i_1} \dots x_n^{i_n}$$

Here, $\delta_{i_1 \dots i_n} = 0$ if there are any identical indices from among
$i_1, \dots, i_n$; $\delta_{i_1 \dots i_n} = +1$ if $i_1, \dots, i_n$ constitute an even
permutation of the positive integers $1, 2, \dots, n$; and
$\delta_{i_1 \dots i_n} = -1$ if the permutation $i_1, \dots, i_n$ is odd.

From this and from the preceding subsection it follows that
$\delta_{i_1 \dots i_n}$ is an axial covariant pseudotensor of weight
$\sigma = -1$. It is also clear that $\delta_{i_1 \dots i_n}$ is skew-symmetric
with respect to any two indices.
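
The component representation of $D$ can be evaluated literally as a sum over all index tuples. The following Python sketch does this under the assumption that indices run from 0 to $n - 1$ (rather than from 1 to $n$) and that each vector is given as a list of its components; the function names `levi_civita` and `D` are illustrative.

```python
from itertools import product

def levi_civita(indices):
    """0 if an index repeats, otherwise +1 or -1 by the parity of the permutation."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    # Each inversion (a larger index standing before a smaller one) flips the sign.
    for p in range(len(indices)):
        for q in range(p + 1, len(indices)):
            if indices[p] > indices[q]:
                sign = -sign
    return sign

def D(vectors):
    """Sum of delta_{i1...in} * x_1^{i1} * ... * x_n^{in} over all index tuples."""
    n = len(vectors)
    total = 0
    for idx in product(range(n), repeat=n):
        eps = levi_civita(idx)
        if eps == 0:
            continue
        term = eps
        for vector, i in zip(vectors, idx):
            term *= vector[i]
        total += term
    return total

x1, x2, x3 = [1, 2, 3], [0, 1, 4], [5, 6, 0]
value = D([x1, x2, x3])
swapped = D([x2, x1, x3])   # interchanging two arguments changes the sign
```

The sum has $n^n$ terms, of which only $n!$ are nonzero, so this is meant as an illustration of the formula rather than a practical way to compute determinants.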

Remark. It is possible to show directly that if we subject
$\delta_{i_1 \dots i_n}$ to a transformation via a purely covariant law of type
(IV), Section 4, we get the numbers

$$\delta_{i'_1 \dots i'_n} = \pm |\det P|^{-1} \sum_{i_1, \dots, i_n} \delta_{i_1 \dots i_n} P_{i'_1}^{i_1} \dots P_{i'_n}^{i_n} \qquad (3)$$

which are exactly the same as $\delta_{i_1 \dots i_n}$. Namely,
$\delta_{i'_1 \dots i'_n} = 0$ if there are any identical indices;
$\delta_{i'_1 \dots i'_n} = \pm 1$ depending on the parity of the permutation
of $i'_1, \dots, i'_n$. To illustrate, note only that if $i'_1 = 1$,
$i'_2 = 2$, ..., $i'_n = n$, then the sum in the right member of (3) is
equal to $\det P$. Therefore $\delta_{1'2' \dots n'} = +1$. The
other cases are left to the reader.


3. Using the form $D(x_1, \dots, x_n)$, we can construct a new
skew-symmetric form of $x_1, \dots, x_n$, which will then be an axial
invariant, that is to say, it will react to the orientation of the basis in
sign alone while preserving its absolute value in all bases. But to
do this we will have to invoke a certain invariant quadratic form.

Let us take at pleasure some invariant quadratic form $a(x, x)$
with the sole proviso that it be nonsingular. Relative to an
arbitrary basis $e_1, \dots, e_n$ this form has a definite component
representation and together with it a definite matrix $A$, where
$\Delta = \det A \neq 0$. When changing to a new basis by (1), the form
$a(x, x)$ receives a new matrix $A'$. If $\Delta' = \det A'$, then, as we know,

$$\Delta' = \Delta (\det P)^2$$

whence

$$\sqrt{|\Delta'|} = \sqrt{|\Delta|} \, |\det P| \qquad (4)$$

Multiplying (2) and (4) termwise, we find

$$\sqrt{|\Delta'|} \, D'(x_1, \dots, x_n) = \pm \sqrt{|\Delta|} \, D(x_1, \dots, x_n)$$

Thus, the skew-symmetric multilinear form
$\sqrt{|\Delta|} \, D(x_1, \dots, x_n)$ is an axial invariant.

4. From this it follows immediately that relative to an arbitrary
basis $e_1, \dots, e_n$ the numbers

$$\sqrt{|\Delta|} \, \delta_{i_1 \dots i_n}$$

are the components of an axial tensor. It is called the discriminant
tensor of the form $a(x, x)$. Considerable use will be made of the
discriminant tensor in Chapters VIII to X.

5. Let $\mathfrak{A}$ be a real $n$-dimensional affine space corresponding to
an $n$-dimensional linear space $L$.

Take an arbitrary point $A \in \mathfrak{A}$ and arbitrary vectors
$x_1, \dots, x_n \in L$, the number of them being equal to the dimension.
Then let $M$ be a point of the space $\mathfrak{A}$ defined by the equation

$$\overrightarrow{AM} = t_1 x_1 + \dots + t_n x_n$$

where $t_1, \dots, t_n$ are real numbers. If $t_1, \dots, t_n$ vary independently
of one another under the conditions $0 \le t_k \le 1$ ($k = 1, \dots, n$),
then all possible resulting points constitute a spatial figure which
is called a parallelepiped constructed on the vectors $x_1, \dots, x_n$
applied to the point $A$. For $n = 2$, the parallelepiped is called a
parallelogram (see Section 8, Chapter III).

Let the space $L$ be oriented by specification of the basis
$e_1, \dots, e_n$. Then if the vectors $x_1, \dots, x_n$ are linearly independent,
the parallelepiped constructed on them is assigned a positive or
negative orientation. The parallelepiped is said to have a positive
orientation if the ordered $n$-tuple of vectors $x_1, \dots, x_n$ is
positively oriented.

6. We wish to associate with every parallelepiped a number, 
which, by analogy with three-dimensional Euclidean space, it 
would be natural to term a volume (in the two-dimensional case, 
an area). Taking into account this analogy, we make the following 
requirements concerning the desired quantity. 

(1) The volume must depend solely on the vectors $x_1, \dots, x_n$
and not on the point $A$.

(2) The volume must be a positive number in the case of a positive
orientation of the parallelepiped, and a negative number in
the case of a negative orientation, and must be zero if the vectors
$x_1, \dots, x_n$ are linearly dependent (in which case the entire
parallelepiped lies in a hyperplane).

(3) The absolute value of the volume must be an invariant.

(4) Increasing the length of one of the vectors $x_1, \dots, x_n$
$\alpha$-fold increases the volume $\alpha$ times.

(5) If $x_1 = x'_1 + x''_1$, then the volume of the parallelepiped
constructed on the vectors $x_1, x_2, \dots, x_n$ must be equal to the sum of
the volumes of the parallelepipeds constructed on $x'_1, x_2, \dots, x_n$
and $x''_1, x_2, \dots, x_n$. A similar property must hold true relative
to the other vectors of the $n$-tuple $x_1, \dots, x_n$.

It turns out that the indicated requirements actually determine
the volume as a function of $x_1, \dots, x_n$. Indeed, they signify that
this function must be a multilinear form of $x_1, \dots, x_n$ that is
skew-symmetric in every pair of arguments. As a numerical quantity it
must be an axial invariant.

But these very same properties are possessed by the multilinear
form $\sqrt{|\Delta|} \, D(x_1, \dots, x_n)$ given in Subsection 3. On the other
hand, by what was described in Section 8 of Chapter V, any other
multilinear form having the same properties is proportional to the form
$\sqrt{|\Delta|} \, D(x_1, \dots, x_n)$. Thus, if the volume of an oriented
parallelepiped constructed on the vectors $x_1, \dots, x_n$ is denoted by
$V(x_1, \dots, x_n)$, then

$$V(x_1, \dots, x_n) = C \sqrt{|\Delta|} \, D(x_1, \dots, x_n) \qquad (5)$$

where $C$ is any invariant constant (different from zero, naturally).


7. We can change the constant $C$ and the form $a(x, x)$, the
determinant $\Delta$ of which enters into (5). However, the factor
$C\sqrt{|\Delta|}$ in the right member of (5) will be uniquely defined if we
designate at pleasure a parallelepiped with unit volume, that is, if we
arbitrarily take the linearly independent vectors $a_1, \dots, a_n$ and
require that $V(a_1, \dots, a_n) = 1$. Thus defining $C\sqrt{|\Delta|}$, we get
the formula

$$V(x_1, \dots, x_n) = \frac{D(x_1, \dots, x_n)}{D(a_1, \dots, a_n)}$$

Thus, the measurement of volumes in affine space is uniquely
determined by an arbitrary choice of the unit of volume. A motivated
choice of the unit of volume is naturally done in linear spaces
equipped with a metric (see Chapter VIII in this connection).
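
The ratio formula for the volume can be tried out numerically. The sketch below, in Python, assumes that the rows of `X` and `U` hold the components of $x_1, \dots, x_n$ and of the unit parallelepiped $a_1, \dots, a_n$ relative to one basis, and that a change of basis multiplies every row of components on the right by one and the same nonsingular matrix `M`. The names `volume`, `det`, `matmul` and the sample matrices are illustrative assumptions.

```python
from fractions import Fraction

def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def volume(xs, unit):
    """V(x_1, ..., x_n) = D(x_1, ..., x_n) / D(a_1, ..., a_n)."""
    return Fraction(det(xs)) / det(unit)

X = [[2, 0, 0], [1, 3, 0], [0, 1, 1]]   # rows: x_1, x_2, x_3
U = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]   # rows: a_1, a_2, a_3 (unit of volume)
M = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]   # component change for a new basis

old_volume = volume(X, U)
new_volume = volume(matmul(X, M), matmul(U, M))
```

Each determinant separately is multiplied by $\det M$ when the basis changes, but the factor cancels in the ratio, so `old_volume` and `new_volume` coincide.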


8. If when changing bases we do not use the entire group of
nonsingular matrices but confine ourselves to the unimodular
subgroup $G$, then there will be no need for the quadratic form $a(x, x)$,
since, relative to $G$, the quantity $D(x_1, \dots, x_n)$ is itself an
invariant. It is also necessary in that case to confine oneself to bases
of a certain class $\mathcal{S}$ relative to the unimodular subgroup $G$.
Suppose that the class $\mathcal{S}$ has been chosen. Then putting

$$V(x_1, \dots, x_n) = C D(x_1, \dots, x_n) \qquad (6)$$

we get the volume as an invariant with respect to the unimodular
subgroup. Since $D(e_1, \dots, e_n) = 1$, it follows that $C = V_0$, where
$V_0 = V(e_1, \dots, e_n)$ is the volume of the parallelepiped constructed
on the basis vectors $e_1, \dots, e_n$. Also note that in (6) we can take
$C = 1$. Then $V(e_1, \dots, e_n) = 1$, which means that a parallelepiped
constructed on the basis vectors $e_1, \dots, e_n$ has unit volume. In this
case, all the bases of the chosen class $\mathcal{S}$ have the feature that for
them $V = +1$.

9. In exactly the same way, the auxiliary quadratic form $a(x, x)$
is not needed if a class of bases $\mathcal{S}(e)$ is considered relative to the
subgroup of matrices with unit modulus of the determinant. In
that case, the volume is also expressed by (6) but is an axial
invariant. As above, we assume that $C = 1$. Then a parallelepiped
constructed on the vectors of basis $e$ will again have the volume
$V = +1$, and all bases of the class $\mathcal{S}(e)$ will have the
characteristic that for them $|V| = 1$, and $V = +1$ or $V = -1$ depending on
the orientation of an arbitrary basis of the class $\mathcal{S}(e)$ with
respect to the original basis $e$.

In order to stress the dependence of the sign of the volume on
the orientation, one often uses the term “oriented volume of a
parallelepiped”.

§ 1. Generalities

1. Definition. A mapping (map) $y = Ax$, $x \in L$, $y \in L$, of the
linear space $L$ into itself (or onto itself) is said to be linear if

$$A(\alpha x_1 + \beta x_2) = \alpha A(x_1) + \beta A(x_2) \qquad (1)$$

for any vectors $x_1, x_2 \in L$ and any scalars $\alpha$, $\beta$. Here and
henceforth numerical factors are real or complex depending on whether
the space $L$ is real or complex.

The mapping $y = Ax$ is also called a linear transformation of
the space $L$. We sometimes say that $A(x)$ is a linear operator
in $L$.

Linear transformations represent a multidimensional generalization
of a linear function of a single numerical argument
$f(x) = kx$. Their diversity grows with increasing dimensionality.

In the notation of linear transformations the parentheses are
ordinarily dropped, and in place of $A(x)$ we write $Ax$.

The simplest instances of linear transformations are: the identity
transformation

$$Ex = x \qquad (2)$$

and the zero transformation

$$0x = 0$$

where $0$ on the left is the symbol of a linear transformation that
associates to every vector $x$ the zero vector.

2. The product of any two linear transformations $A$ and $B$ is
linear:

$$AB(\alpha x_1 + \beta x_2) = A(\alpha Bx_1 + \beta Bx_2) = \alpha ABx_1 + \beta ABx_2$$

3. For linear transformations we define the operations of addition
and multiplication by scalars:

$$(A + B)x = Ax + Bx, \qquad (\alpha A)x = \alpha Ax \qquad (3)$$

It may readily be shown that the transformations $A + B$ and $\alpha A$
are also linear and that the set of linear transformations of a
space $L$ is itself a linear space. The role of the zero vector in the
space of linear transformations is played by the zero transformation
$0$. We leave the proof of these assertions to the reader.

4. The existence of three operations—multiplication of linear 
transformations, addition, and multiplication by scalars—makes 
it possible to construct polynomials of transformations: 

$$p(A) = a_0 A^n + a_1 A^{n-1} + \dots + a_n E \qquad (4)$$

where $a_i$ are scalars; the powers of a transformation are defined
by successive multiplication: $A^2 = AA$, $A^3 = AAA$, and so forth.

For any transformation $A$ it is assumed, by definition, that

$$A^0 = E \qquad (5)$$

so that the term $a_n E$ in the polynomial (4) plays the part of the
constant term.
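
A polynomial of a transformation can be evaluated on a matrix using only the three operations just listed. The Python sketch below uses Horner's scheme, which is one possible way to organize the computation and is not prescribed by the text; the coefficients are assumed to be listed from the highest power to the constant term, and the names `matmul` and `poly_of` are illustrative.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_of(A, coeffs):
    """Evaluate a_0 A^n + a_1 A^(n-1) + ... + a_n E for a square matrix A."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        # Horner step: result <- result * A + c * E
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A = [[2, 1], [1, 1]]
square = poly_of(A, [1, 0, 0])         # p(t) = t^2, so this is AA
constant = poly_of(A, [7])             # p(t) = 7, so this is 7E
characteristic = poly_of(A, [1, -3, 1])
```

For this particular 2 x 2 matrix the polynomial $t^2 - 3t + 1$ is built from the trace 3 and the determinant 1, and by the Cayley-Hamilton theorem (a standard fact not used in this section) `characteristic` is the zero matrix.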

5. Assuming a space to be finite-dimensional, let us introduce
the basis $e_1, \dots, e_n$.

Suppose that we know the images of the basis vectors $Ae_k$ relative
to the given basis, that is, the coefficients of the expansions:

$$Ae_k = \sum_{\alpha} A_k^{\alpha} e_{\alpha} \qquad (6)$$

Then we know the matrix of the quantities $A_k^{\alpha}$. It is not by
accident that the indices (one upper, one lower) are set the way they are.
Below we will show that $A_k^{\alpha}$ is a tensor of that order. Put

$$A^* = \begin{pmatrix} A_1^1 & A_1^2 & \dots & A_1^n \\ A_2^1 & A_2^2 & \dots & A_2^n \\ \vdots & \vdots & & \vdots \\ A_n^1 & A_n^2 & \dots & A_n^n \end{pmatrix}$$

The symbol $*$ is employed because in the great majority of
applications of linear transformations it is not this matrix but its
transpose that occurs and so the unadorned symbol $A$ is left for
the transpose.

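We now show how, if we know matrix $A$, we can compute $y$
for any $x$:

$$y = \sum_i y^i e_i = Ax = A\Big(\sum_k x^k e_k\Big)$$

We take advantage of the linearity of the transformation $A$:

$$y = \sum_k x^k Ae_k = \sum_{k,\alpha} x^k A_k^{\alpha} e_{\alpha}$$

Changing the designation of one of the indices, we get

$$y = \sum_i y^i e_i = \sum_i \Big(\sum_k A_k^i x^k\Big) e_i$$

whence

$$y^i = \sum_{k=1}^{n} A_k^i x^k, \qquad i = 1, \dots, n \qquad (7)$$

This coordinate (component) notation is equivalent to a single
matrix equation:

$$y = Ax \qquad (8)$$

where

$$A = \begin{pmatrix} A_1^1 & A_2^1 & \dots & A_n^1 \\ A_1^2 & A_2^2 & \dots & A_n^2 \\ \vdots & \vdots & & \vdots \\ A_1^n & A_2^n & \dots & A_n^n \end{pmatrix}, \quad x = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}, \quad y = \begin{pmatrix} y^1 \\ \vdots \\ y^n \end{pmatrix}$$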

It is easy to verify that different matrices A and B specify 
distinct linear transformations, relative to a given basis. 

Thus, the linear transformation $y = Ax$ of vectors of the space $L$
is expressed in the form of a linear transformation of the
variables (7), which transformation is given in matrix form by the
very same equation $y = Ax$. It is termed the coordinate (component)
representation of the linear transformation $A$.
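
The passage from a transformation to its matrix and back can be sketched in Python. The sketch assumes that the transformation is available as an ordinary function acting on lists of components relative to the standard basis; the names `matrix_of`, `apply` and the sample map `example_map` are hypothetical and chosen only for illustration.

```python
def matrix_of(transform, n):
    """Matrix A with entries A[i][k] = A_k^i: column k holds the components of A e_k."""
    images = [transform([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    return [[images[k][i] for k in range(n)] for i in range(n)]

def apply(A, x):
    """Component form (7): y^i = sum over k of A_k^i x^k."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

def example_map(x):
    """A sample linear transformation of three-dimensional space."""
    return [x[0] + 2 * x[1], x[1], 3 * x[2] - x[0]]

A = matrix_of(example_map, 3)
x = [1, 2, 3]
y_from_matrix = apply(A, x)
y_direct = example_map(x)
```

Here `y_from_matrix` and `y_direct` agree, which is the content of the equivalence between the invariant equation $y = Ax$ and its coordinate form (7).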

6. Using formulas (3) and (7), it is easy to verify that in the
addition of linear transformations their matrices are added, and in
the multiplication of a linear transformation by a scalar, the
matrix is multiplied by that scalar, so that the space of linear
transformations of an $n$-dimensional linear space $L$ is isomorphic to
the space of $n \times n$ matrices. As was done in Section 2 of
Chapter II, we can show that when two linear transformations are
multiplied together, so are their matrices. To the identity
transformation corresponds the unit matrix $E$, to the zero transformation
corresponds a matrix consisting of zeros. By the foregoing, equations
(2)-(5) may be regarded with equal right as expressions of
transformations or as expressions of matrices.

7. It is appropriate now to indicate some generalizations of the
notions introduced in this section. Given two linear spaces $L$
and $L'$. A linear map of space $L$ into $L'$ or a linear operator from $L$
to $L'$ is a function $y = Ax$ which associates to every vector $x \in L$
a vector $y$ in $L'$ and satisfies the condition of linearity (1). For
$L' = L$ we get the linear transformation defined in Subsection 1.
For linear operators we define the operations of addition and
multiplication by a scalar in accordance with (3). It can be shown
that the set of all linear mappings of $L$ into $L'$ forms a linear
space.

If each of the spaces $L$ and $L'$ is viewed as a group under the
addition of vectors, then any linear operator $L \to L'$ is a
homomorphism. If $L$ and $L'$ are finite-dimensional and bases have been
chosen in them, then the linear operator $L \to L'$ is given by a
matrix and is expressed as a linear transformation of the vector
components, but, in contrast to Subsection 5, the matrix is in
general rectangular. When the dimensions of $L$ and $L'$ coincide, the
operator $A$ has a square matrix.

Given three linear spaces $L$, $L'$, $L''$. We consider two linear
mappings:

(1) $y = Bx$, where $x \in L$, $y \in L'$;

(2) $z = Ay$, where $y \in L'$, $z \in L''$.

The product $AB$ of the operator $A$ by the operator $B$ is defined by
the formula

$$z = ABx = A(Bx)$$

and maps $L$ into $L''$. The linearity of $AB$ is proved as in
Subsection 2.

§ 2. A linear transformation as a tensor 

1. We assume the space $L$ to be $n$-dimensional. We consider the
linear transformation $y = Ax$ of the space $L$. It is defined
invariantly, that is, independently of any bases whatsoever.

We are now interested in the tensorial nature of a transformation.
In $L$ we pass to a new basis $e_{1'}, \dots, e_{n'}$. Then

$$Ae_{k'} = \sum_{i'} A_{k'}^{i'} e_{i'}$$

How is $A_{k'}^{i'}$ expressed in terms of $A_k^i$? It is not hard to figure
out that what we have is the tensor law of transformation corresponding
to the arrangement of the indices. This can be established
without any calculations whatsoever. Indeed, the collection of all
vectors $x$ in $L$ coincides with the set of all possible first-order
contravariant tensors. The contraction $\sum_k A_k^i x^k$, for all $x \in L$,
yields a first-order contravariant tensor $y^i$, to which corresponds
one very definite vector $y = \sum_i y^i e_i$, irrespective of the basis $e_i$.
From this, on the basis of a familiar criterion (Chapter V, Section 4,
Subsections 8, 9), we conclude that $A_k^i$ is a tensor, and we
can straightway write down the transformation law of its components:

$$A_{k'}^{i'} = \sum_{i,k} A_k^i P_{k'}^k Q_i^{i'} \qquad (1)$$

Thus, to every linear transformation is invariantly associated a
tensor

$$A = \sum_{i,k} A_k^i \, e_i \otimes e^k \qquad (2)$$

in $T_1^1 = L \otimes L^*$, where $e^1, \dots, e^n \in L^*$ is a basis reciprocal to
$e_1, \dots, e_n$.

The converse is also true: to every second-order mixed tensor we
can associate invariantly a linear transformation, since the
contraction of tensor (2) with the vector $x = x^1 e_1 + \dots + x^n e_n$ yields
a contravariant vector:

$$y = y^1 e_1 + \dots + y^n e_n = \sum_{i,k} A_k^i x^k e_i$$

which is independent of the choice of basis $e_1, \dots, e_n$.

2. We can consider linear mappings of $L$ into $L^*$, $L^*$ into $L$, $L^*$
into $L^*$ and, as before, prove that they too are associated with
second-order tensors.

We now show how to set the indices with the knowledge that
the transformation is a tensor.

For example, let $u = Ax$, where $x \in L$, $u \in L^*$. We pass to the
component notation. The vector $x$ is contravariant and so its
components are indicated by superscripts: $\{x^k\}$. The letter $A$ must have
a lower index $k$ so that it will be possible to effect a contraction
on $k$. The vector $u$ is covariant and so its components are
indicated by subscripts: $\{u_i\}$.

Therefore the contraction must yield a covariant vector, which
means that the other index on $A$ must also be a subscript:

$$u_i = \sum_k A_{ik} x^k$$

This notation signifies that a linear mapping of the space $L$ into
the conjugate space $L^*$ is associated with the doubly covariant
tensor $A_{ik}$.

Similarly, for the transformation $y = Bv$ of the conjugate space
$L^*$ into $L$ we have a component representation of the form

$$y^i = \sum_k B^{ik} v_k$$

so that the appropriate tensor $B^{ik}$ is doubly contravariant.

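3. Let $A$ be a linear transformation of $L$. The matrix $A$ of this
transformation relative to the given basis $e_1, \dots, e_n$ is written
thus:

$$A = \begin{pmatrix} A_1^1 & A_2^1 & \dots & A_n^1 \\ A_1^2 & A_2^2 & \dots & A_n^2 \\ \vdots & \vdots & & \vdots \\ A_1^n & A_2^n & \dots & A_n^n \end{pmatrix}$$

Passing to a new basis $e_{1'}, \dots, e_{n'}$, we get a new matrix $A'$, whose
elements are expressed by (1). Now let us write (1) in matrix
notation. It is best to write out the matrices in full so as not to make
a mistake in the order when multiplying them.

The rows of matrix $P$ are expanded in terms of the upper index:

$$P = \begin{pmatrix} P_{1'}^1 & P_{1'}^2 & \dots & P_{1'}^n \\ P_{2'}^1 & P_{2'}^2 & \dots & P_{2'}^n \\ \vdots & \vdots & & \vdots \\ P_{n'}^1 & P_{n'}^2 & \dots & P_{n'}^n \end{pmatrix}$$

The rows of matrix $Q$ are expanded in terms of the lower index:

$$Q = \begin{pmatrix} Q_1^{1'} & Q_2^{1'} & \dots & Q_n^{1'} \\ Q_1^{2'} & Q_2^{2'} & \dots & Q_n^{2'} \\ \vdots & \vdots & & \vdots \\ Q_1^{n'} & Q_2^{n'} & \dots & Q_n^{n'} \end{pmatrix}$$

Formula (1) can be rewritten thus:

$$A_{k'}^{i'} = \sum_{i,k} Q_i^{i'} A_k^i P_{k'}^k$$

whence we obtain the desired matrix expression

$$A' = Q A P^*$$

or

$$A' = Q A Q^{-1} \qquad (1\mathrm{a})$$

where, as usual, we have

$$Q = (P^*)^{-1}$$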

4. Very important corollaries follow from these matrix formulas.

Since $Q$ is a nonsingular matrix, it follows from (1a) that
$\operatorname{rank} A' = \operatorname{rank} A$. Thus, the rank of $A$ is an
invariant. Furthermore, another invariant is the determinant of the
linear transformation, since

$$\det A' = \det Q \, \det A \, (\det Q)^{-1} = \det A$$

Also an invariant is the complete contraction of the tensor $A_k^i$:

$$\sum_{\alpha} A_{\alpha}^{\alpha} = A_1^1 + A_2^2 + \dots + A_n^n$$

which is the trace of the matrix of the linear transformation.

Note that when we speak of the “determinant of a matrix” or
the “trace of a matrix” without indicating the object associated
with the matrix, the question of invariance is not clear. For
example, neither the determinant nor the trace of the matrix of a
bilinear form is an invariant.
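
The contrast between the two kinds of matrices can be illustrated numerically. The Python sketch below compares the law $A' = QAQ^{-1}$ for the matrix of a linear transformation with the law $B' = Q^{T} B Q$, which is assumed here as the standard transformation rule for the matrix of a bilinear form under a change of basis with matrix $Q$ (the exact placement of the transpose depends on the index conventions). The sample matrices and helper names are illustrative.

```python
from fractions import Fraction

def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def inverse(M):
    """Inverse via cofactors, computed exactly with Fractions."""
    d = Fraction(det(M))
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) / d for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    return [list(column) for column in zip(*M)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

Q = [[1, 2], [3, 4]]                 # nonsingular, det = -2
A = [[2, 0], [1, 3]]                 # same numbers, read in two different ways

as_transformation = matmul(matmul(Q, A), inverse(Q))    # A' = Q A Q^{-1}
as_bilinear_form = matmul(matmul(transpose(Q), A), Q)   # B' = Q^T B Q
```

With these numbers the determinant 6 and the trace 5 survive the first law, while under the second law the determinant becomes 24 and the trace becomes 96.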

5. Let A be a linear transformation of a space and let the same 
symbol A denote the matrix of this transformation relative to an 
arbitrarily chosen basis. The foregoing subsection enables us to 
give the following definitions. 

(1) The rank of matrix $A$ is called the rank of the transformation $A$.

(2) The determinant of matrix $A$ is called the determinant of the
transformation $A$.

(3) The trace of matrix $A$ is called the trace of the transformation $A$.

The geometrical meaning of rank and determinant of a transformation
is considered in the next section.

§ 3. The geometrical meaning of the rank and determinant 
of a linear transformation. The group of nonsingular 
linear transformations 

1. Given in an $n$-dimensional vector space $L$ a linear
transformation $A$. Suppose that $\operatorname{rank} A = r$.

Denote the image of $L$ by $\mathcal{M}$ or by $A(L)$, that is, the set of
elements $y$ of the form $y = Ax$, where $x$ ranges over the whole of $L$.

Theorem 1. The set $\mathcal{M} = A(L)$ is a linear subspace of
dimension $r$ in $L$.

Proof. We have

$$y = Ax = A\Big(\sum_k x^k e_k\Big) = \sum_k x^k Ae_k$$

Hence, $\mathcal{M} = A(L)$ is a linear hull of the vectors $Ae_1, \dots, Ae_n$;
but, as we know, the linear hull of a given system of vectors is a
subspace whose dimension is equal to the rank of the system of
vectors. The components of the vectors $Ae_1, \dots, Ae_n$ form the columns of
matrix $A$ so that the dimension of $A(L)$ is equal to the rank of $A$.
The theorem is proved.

2. Denote by $\mathcal{N}$ the total inverse image of the zero vector $0$
under the transformation $A$, that is, the set of all vectors $x$ of space $L$
for which $Ax = 0$. The set $\mathcal{N}$ is called the null space of the
transformation $A$.

Theorem 2. If $\operatorname{rank} A = r$, then the null space $\mathcal{N}$ of the
transformation $A$ is a subspace of dimension $n - r$ in the space $L$.

Proof. $x \in \mathcal{N}$ if and only if $Ax = 0$. Writing this vector
equation in components in terms of an arbitrary basis $e_1, \dots, e_n$, we
get a system of homogeneous linear equations whose rank is $r$:

$$\begin{aligned} A_1^1 x^1 + \dots + A_n^1 x^n &= 0, \\ &\dots \\ A_1^n x^1 + \dots + A_n^n x^n &= 0 \end{aligned} \qquad (1)$$

According to Section 5, Chapter III, the set of vectors whose
components satisfy (1) is a subspace of dimension $n - r$, which
is what we wanted to prove.

3. Theorems 1 and 2 permit us to give two geometric definitions 
of the rank of a transformation that are equivalent to the original 
algebraic definition (Section 2, Subsection 5). 

(1) The rank of a linear transformation is equal to the dimension
of the image of the entire space $L$.

(2) The rank of a linear transformation is equal to the difference 
between the dimension of the space and the dimension of the null 
space of the transformation (that is, of the complete inverse image 
of the zero vector). 
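
The two geometric descriptions of rank can be compared on a concrete matrix. The Python sketch below uses Gauss-Jordan elimination with exact fractions, which is one standard way to compute the rank and a basis of the null space and is not a method taken from the text; the function names `rref`, `null_space`, `apply` and the sample matrix are illustrative.

```python
from fractions import Fraction

def rref(A):
    """Reduced row echelon form of A and the list of pivot columns."""
    M = [[Fraction(entry) for entry in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots = []
    r = 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        lead = M[r][c]
        M[r] = [entry / lead for entry in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

def null_space(A):
    """Basis of the solutions of A x = 0, one vector per free column."""
    M, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -M[row_index][free]
        basis.append(v)
    return basis

def apply(A, x):
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
rank = len(rref(A)[1])            # dimension of the image A(L)
kernel = null_space(A)            # basis of the null space
```

For this matrix the rank is 2 and the null space is one-dimensional, so that rank plus the dimension of the null space equals $n = 3$, in agreement with definition (2).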

4. Let $r < n$. Consider the action of the transformation $A$ from
the geometric point of view.

Here it will be convenient not to distinguish between a linear
space $L$ and the corresponding affine space $\mathfrak{A}$ and to identify every
point of $\mathfrak{A}$ with its radius vector.

Let us consider the nonhomogeneous system of equations

$$\sum_j A_j^i x^j = y^i, \qquad i = 1, \dots, n \qquad (2)$$

where $A = \|A_j^i\|$ is the matrix of the linear transformation under
consideration. This system is solvable if and only if the vector
$y = y^1 e_1 + \dots + y^n e_n$ belongs to the subspace $\mathcal{M} = A(L)$. For
every $y \in \mathcal{M}$ the solution set of system (2) forms a plane of
dimension $n - r$ parallel to the subspace $\mathcal{N}$ (in this connection, see
Sections 6, 7 of Chapter III). It is evident that every point of the
space belongs to one such plane.

Thus, the entire space splits into parallel planes of dimension
$n - r$, each of which is mapped into a single point of the
subspace $\mathcal{M}$.

5. Definition. When $\operatorname{rank} A = n$, the transformation $A$ is said
to be nonsingular.

We can give other equivalent conditions for nonsingularity:

(1) $\det A \neq 0$,

(2) $\mathcal{M} = A(L) = L$,

(3) $\mathcal{N} = 0$.

Every element of the space $L$ then has a unique inverse image.
This can be verified directly by solving (2) by Cramer’s rule.
Denoting by $\tilde{A}_k^i$ the elements of the inverse matrix $A^{-1}$, we get

$$x^i = \sum_k \tilde{A}_k^i y^k$$

or, in symbolic form,

$$x = A^{-1} y \qquad (3)$$

The transformation (3) is a linear transformation inverse to the
given one.

6. Theorem 3. The set of all nonsingular linear transformations
forms a group of transformations of the space $L$.

Proof. From the theorem on the rank of a product of matrices
(Chapter II, Section 4) it follows that the transformation $AB$ is
nonsingular if $A$ and $B$ are nonsingular. Furthermore,
$\det A^{-1} = (\det A)^{-1} \neq 0$ and so the inverse transformation $A^{-1}$
is nonsingular. Thus, the set of nonsingular linear transformations of $L$
satisfies the definition of a group of transformations (see Section 2,
Chapter VI).

7. Theorem 4. In $n$-dimensional linear space, a nonsingular linear
transformation $A$ is determined uniquely if we have given
an arbitrary system of $n$ independent vectors $x_1, \dots, x_n$ as inverse
images and an arbitrary system of $n$ independent vectors $y_1, \dots, y_n$
as images.

Proof. We take the vectors $x_1, \dots, x_n$ for a basis in $L$ and
expand the vectors $y_1, \dots, y_n$ in terms of this basis:

$$y_k = a_k^1 x_1 + \dots + a_k^n x_n, \qquad k = 1, \dots, n \qquad (4)$$

Matrix $A$ of the desired transformation is uniquely defined relative
to the basis $x_1, \dots, x_n$ by specification of the vectors (4), since
its columns are formed by the components of these vectors
($A_k^i = a_k^i$, see Subsection 5 of Section 1 above); $\det A \neq 0$ since
the vectors (4) are independent. The proof of Theorem 4 is complete.

8. Let us determine the geometric meaning of the determinant of 
a linear transformation A. 

To do so, we make use of the notion of the volume of a parallelepiped
(see Section 5, Chapter VI). We define the class of bases
with respect to the subgroup of matrices with determinant modulus
unity and take one of the bases $e_1, \dots, e_n$ of that class. Denote
by $V_0$ the oriented volume of a parallelepiped constructed on the
vectors $e_1, \dots, e_n$ and compute the oriented volume V of the
parallelepiped constructed on the vectors $Ae_1, \dots, Ae_n$. Taking into
account that the components of the vectors $Ae_i$ form the columns
of matrix A, we get, by (6) of Section 5, Chapter VI,

$V = V_0 \det A$ (5)

Hence, under the given linear transformation, all volumes
change by the same factor, and the determinant of the
transformation is this factor. In the case of a
nonsingular transformation, we get $V \neq 0$, and the bases $e_1, \dots, e_n$
and $Ae_1, \dots, Ae_n$ have the same orientation if $\det A > 0$ and the
opposite orientation if $\det A < 0$. If a transformation is singular,
then $\det A = 0$, the vectors $Ae_1, \dots, Ae_n$ are linearly dependent, and
V = 0. Also observe that (5) can be derived directly from the
theorem on the determinant of a product of matrices.
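
The role of the determinant as a volume factor is easy to check in the plane, where the oriented volume is an oriented area. The sketch below is only an illustration: the function names and the numerical matrix and vectors are assumptions chosen for the example.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def apply(m, v):
    """Image of the vector v under the transformation with matrix m."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def oriented_area(u, v):
    """Oriented area of the parallelogram constructed on u and v."""
    return u[0] * v[1] - u[1] * v[0]

# Hypothetical data: det A = -6, so areas grow sixfold and orientation flips.
A = [[2, 1], [0, -3]]
u, v = [1, 2], [3, 1]

V0 = oriented_area(u, v)                      # area before the transformation
V = oriented_area(apply(A, u), apply(A, v))   # area after the transformation
```

Here V equals V0 multiplied by det A; the negative determinant shows up as a change of sign, that is, of orientation.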

§ 4. Invariant subspaces 

1. Definition. A subspace $L' \subset L$ is said to be an invariant
subspace of the transformation y = Ax if $Ax \in L'$ for every $x \in L'$.
(In symbols we can write $A(L') \subset L'$.)

Examples of invariant subspaces are the subspaces $\mathcal{M}$ and $\mathcal{N}$
introduced in Section 3. We now prove this.

(1) $Ax \in \mathcal{M}$ for any vector $x \in L$, in particular for any $x \in \mathcal{M}$,
and so $A(\mathcal{M}) \subset \mathcal{M}$.

(2) If $x \in \mathcal{N}$, then $Ax = \theta \in \mathcal{N}$, so that $A(\mathcal{N}) \subset \mathcal{N}$.

The zero subspace (which consists of the single vector $\theta$) is
invariant under any linear transformation A since $A\theta = \theta$.

2. Let L' be a subspace invariant with respect to A. Then the
transformation A does not carry vectors belonging to L' outside L'.
Thus a linear transformation is defined in L':

$y = Ax, \quad x \in L', \quad y \in L'$ (1)

We will say that the transformation A specified in L induces the
transformation (1) in the invariant subspace L'. At times it is
convenient to denote the induced transformation by a different symbol
than A, say A'. Then A'x = Ax if $x \in L'$, and A'x is not defined if x
does not belong to L'.

If the transformation A is nonsingular, then the induced
transformation A' is nonsingular as well, and for that reason
A(L') = A'(L') = L'. This is clear since otherwise there would be a
vector $x \in L' \subset L$, $x \neq \theta$, for which $Ax = \theta$.

3. If the subspace L' is not invariant with respect to A, then
there is a vector $x \in L'$ for which Ax does not belong to L'. For
this reason, A does not induce any transformation in the
subspace L'.
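
Invariance of a subspace can be tested numerically: the span of a set of vectors is invariant exactly when adding their images does not enlarge the span. The sketch below assumes this rank criterion; the helper names and the 3 x 3 matrix are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def apply(a, x):
    """Image of the vector x under the transformation with matrix a."""
    return [sum(c * v for c, v in zip(row, x)) for row in a]

def is_invariant(a, spanning):
    """True if the span of `spanning` is carried into itself by A."""
    images = [apply(a, v) for v in spanning]
    # Adding the images must not enlarge the span.
    return rank(spanning + images) == rank(spanning)

# Hypothetical matrix used only for illustration.
A = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]
e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
```

For this matrix the span of e1 and e2 is invariant, while the line spanned by e3 is not, so A induces no transformation in that line.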

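4. We now show that if the invariant subspace L' is known,
then it is possible to simplify the transformation matrix by placing
several basis vectors in L'.

Let the vectors $e_1, \dots, e_k \in L'$ form a basis of L'. Then their images
belong to L' and can be expanded in terms of the same vectors:

$Ae_1 = A^1_1 e_1 + \dots + A^k_1 e_k$
. . . . . . . . . . . . . . . .
$Ae_k = A^1_k e_1 + \dots + A^k_k e_k$

In general, these are followed by longer expansions:

$Ae_{k+1} = A^1_{k+1} e_1 + \dots + A^k_{k+1} e_k + A^{k+1}_{k+1} e_{k+1} + \dots + A^n_{k+1} e_n$
. . . . . . . . . . . . . . . .
$Ae_n = A^1_n e_1 + \dots + A^k_n e_k + A^{k+1}_n e_{k+1} + \dots + A^n_n e_n$

Thus, in the case at hand, the transformation matrix (which is
the transpose of the matrix of coefficients of the expansions that
have been written out) is given as follows:

$$A = \begin{pmatrix}
A^1_1 & \dots & A^1_k & A^1_{k+1} & \dots & A^1_n \\
\vdots & & \vdots & \vdots & & \vdots \\
A^k_1 & \dots & A^k_k & A^k_{k+1} & \dots & A^k_n \\
0 & \dots & 0 & A^{k+1}_{k+1} & \dots & A^{k+1}_n \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \dots & 0 & A^n_{k+1} & \dots & A^n_n
\end{pmatrix}$$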

5. Let us consider an important special case where the space L
is the direct sum of two nonzero invariant subspaces L' and L'':

$L = L' \oplus L'', \quad A(L') \subset L', \quad A(L'') \subset L''$

Choose bases in L' and L'':

$e_1, \dots, e_k \in L', \quad e_{k+1}, \dots, e_n \in L''$

Then by Theorem 4, Section 14, Chapter I, the vectors $e_1, \dots, e_n$
form a basis of the space L. Relative to such a basis, the
computations of the preceding subsection are applicable to $e_1, \dots, e_k$
and to $e_{k+1}, \dots, e_n$ as well. Therefore the matrix A decomposes into
two autonomous "boxes":

$$A = \begin{pmatrix} k \times k & 0 \\ 0 & (n - k) \times (n - k) \end{pmatrix}$$

These "boxes" are the matrices of the linear transformations induced
on L' and L''.

Thus, the study of the transformation of the space as a whole
reduces to the study of its action in L' and L''.

6. In Section 10, below, we make use of the following lemma.

Lemma. If $L = L' \oplus L''$ and the subspaces L', L'' are invariant
under A, then $A(L) = A(L') \oplus A(L'')$.

Proof. If L is the sum of L' and L'' (though not necessarily the
direct sum), then, as is readily verifiable, A(L) = A(L') + A(L'').
On the other hand, because of the invariance of L' and L'' we
have

$A(L') \subset L', \quad A(L'') \subset L''$

whence

$A(L') \cap A(L'') \subset L' \cap L''$

but the sum of L' and L'' is a direct sum, and so $L' \cap L'' = \theta$.
Hence $A(L') \cap A(L'') = \theta$ and, consequently,
$A(L') + A(L'') = A(L') \oplus A(L'')$, which completes the proof.

§ 5. Examples of linear transformations 

Preliminary remark. When examining examples of transformations,
it is convenient not to distinguish between the linear space L
and the corresponding affine space (as is done in Subsection 4
of Section 3).

1. Similarity transformation. The space L has any dimension.
The transformation is given by

$Ax = \lambda x$

for any x and a fixed $\lambda$ called the coefficient of similarity
(expansion factor or contraction factor). All vectors are "stretched" by
the same factor (for $|\lambda| < 1$ they contract). In this case
every subspace is invariant. The matrix of a similarity
transformation of an n-dimensional space, given an arbitrary choice of
basis, has the form

$$A = \lambda E = \begin{pmatrix} \lambda & & 0 \\ & \ddots & \\ 0 & & \lambda \end{pmatrix}; \quad \det A = \lambda^n$$

The identity transformation E may be regarded as a similarity
transformation with coefficient unity, and the zero transformation 0
as a similarity transformation with coefficient zero. In the
space of all linear transformations given on L, similarity
transformations $A = \lambda E$ form a one-dimensional subspace (a straight
line passing through the points 0 and E).

2. n = 3. Let $x = \{x^1, x^2, x^3\}$ be an arbitrary vector and
$y = \{y^1, y^2, y^3\}$ its image. We give the transformation y = Ax by

$y^1 = x^1 + x^2 + x^3$,
$y^2 = x^1 + x^2 + x^3$, (1)
$y^3 = 2x^1 + x^2 - x^3$

It is clear that the transformation is singular and rank A = 2. The
image of the entire space is the subspace $\mathcal{M} = A(L)$ given by the
equation

$y^2 = y^1$

Let us find the total inverse image of an arbitrary point $y^1 = a$,
$y^2 = a$, $y^3 = b$ of the plane $\mathcal{M} = A(L)$ (Fig. 29). System (1) is
consistent for the indicated values of $y^1, y^2, y^3$. The first equation
may be dropped, leaving two equations that define a straight line:

$x^1 + x^2 + x^3 = a$,
$2x^1 + x^2 - x^3 = b$

The straight line thus found is the desired inverse image. For
different a and b, all such straight lines are parallel and cover the
entire space. In Fig. 29, $\mathcal{N}$ denotes the straight line which is the
complete inverse image of $\theta$. Points not in the plane A(L) do not
have inverse images.
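
The claims of this example can be checked directly. The sketch below assumes that the matrix of system (1) has rows (1, 1, 1), (1, 1, 1), (2, 1, -1); the parametrization of the inverse-image line is worked out here for the illustration and is not taken from the text.

```python
from fractions import Fraction

# Assumed matrix of system (1).
A = [[1, 1, 1],
     [1, 1, 1],
     [2, 1, -1]]

def apply(a, x):
    """Image of the vector x under the transformation with matrix a."""
    return [sum(c * v for c, v in zip(row, x)) for row in a]

def rank(rows):
    """Rank of a matrix, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [p - f * q for p, q in zip(rows[i], rows[r])]
        r += 1
    return r

def inverse_image_point(a, b, t):
    """Point with parameter t on the line that A carries into (a, a, b)."""
    # Solving x1 + x2 + x3 = a, 2*x1 + x2 - x3 = b with x3 = t as parameter.
    return [b - a + 2 * t, 2 * a - b - 3 * t, t]
```

Every line obtained this way has the direction (2, -3, 1), which spans the null space, so the lines for different a and b are indeed parallel.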

3. n = 2. We give the transformation y = Ax by

$y^1 = k x^1$,
$y^2 = x^2$

The transformation matrix is $A = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$. Let us take an arbitrary
point M and its image M'. The line segment MM' is parallel to
the $x^1$-axis. Extend this segment to intersection with the $x^2$-axis at
point K (Fig. 30). Then for any choice of M we have

$\frac{KM'}{KM} = k$

If $|k| < 1$, then KM contracts. If $k < 0$, then the points M
and M' lie in different half-planes $x^1 > 0$ and $x^1 < 0$ (Fig. 31).

This kind of transformation is called a compression along the
$x^1$-axis to the $x^2$-axis. When $|k| > 1$ we speak of a stretching.


4. The transformation

$y^1 = x^1$,
$y^2 = k x^2$

with matrix $A = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$ is a compression along the $x^2$-axis to
the $x^1$-axis (for $|k| > 1$ we actually have a stretching, see Fig. 32).

5. The transformation

$y^1 = k_1 x^1$,
$y^2 = k_2 x^2$ (2)

has the matrix

$$A = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix} = A_1 A_2 = A_2 A_1, \quad A_1 = \begin{pmatrix} k_1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 1 & 0 \\ 0 & k_2 \end{pmatrix}$$

Therefore, a transformation with matrix A may
be obtained by compounding a compression along the $x^1$-axis to
the $x^2$-axis and a compression along the $x^2$-axis to the $x^1$-axis in
any order. Transformations of this type are frequently encountered
in the theory of elasticity.

It is easy to verify that in each of the examples of Subsections
3-5, the axes $x^1$ and $x^2$ are invariant subspaces.
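
That the two compressions may be compounded in any order amounts to the statement that their matrices commute. A minimal sketch follows; the function names and the coefficients 3 and 1/2 are hypothetical values chosen for the illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def compress_along_x1(k1):
    """Compression along the x1-axis to the x2-axis with coefficient k1."""
    return [[k1, 0], [0, 1]]

def compress_along_x2(k2):
    """Compression along the x2-axis to the x1-axis with coefficient k2."""
    return [[1, 0], [0, k2]]

# Hypothetical coefficients: a stretching by 3 and a compression by 1/2.
A1 = compress_along_x1(3)
A2 = compress_along_x2(Fraction(1, 2))

product_12 = matmul(A1, A2)
product_21 = matmul(A2, A1)
```

Both products equal the diagonal matrix with entries 3 and 1/2, which is the matrix of the compound transformation.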

6. Let $L = L' \oplus L''$. The dimension of L is arbitrary and the
subspaces L' and L'' are not zero. Let L' and L'' be invariant
subspaces of the transformation A which induces in L' an identity
transformation and in L'' a similarity transformation with
coefficient $\lambda$. This kind of transformation is called a compression with
coefficient $\lambda$ to the subspace L' in the direction of the subspace L''.

Representing x in the form $x = x' + x''$ ($x' \in L'$, $x'' \in L''$) we
get

$Ax = x' + \lambda x''$

Let the vectors $e_1, \dots, e_k$ form a basis in L', and the vectors
$e_{k+1}, \dots, e_n$ a basis in L''. Then, relative to the basis $e_1, \dots, e_n$,
the transformation matrix A is diagonal, with k units followed by
n - k entries equal to $\lambda$ on the main diagonal:

$$A = \operatorname{diag}(\underbrace{1, \dots, 1}_{k}, \underbrace{\lambda, \dots, \lambda}_{n-k})$$

Note that the linear hull $L(e_{i_1}, \dots, e_{i_m})$ of any subsystem of the
basis $e_1, \dots, e_n$ is an invariant subspace.

7. Putting $\lambda = 0$ in the preceding example, we get a transformation
that is called the projection of space L onto the subspace L'
in the direction of the subspace L''. The projection may be
determined directly as follows. If x is any vector of L, then it can be
uniquely expressed as $x = x' + x''$, where $x' \in L'$, $x'' \in L''$. Then
the projection of x on L' in the direction of L'' is $Ax = x'$. The
projection is a degenerate transformation; L' is the image of the
whole of L, and L'' is the complete inverse image of the zero vector.


8. As an example, we now give a construction that will be
important for what follows. Fix a basis $e_1, \dots, e_n$ in the space L.
Relative to this basis, the transformation, which we denote by
$G_n(\lambda)$ or, more compactly, G, will be given by the formulas

$Ge_1 = \lambda e_1$,
$Ge_2 = e_1 + \lambda e_2$,
$Ge_3 = e_2 + \lambda e_3$, (3)
. . . . . . . . .
$Ge_n = e_{n-1} + \lambda e_n$

The matrix of this transformation with respect to the basis
$e_1, \dots, e_n$ is called an n-dimensional Jordan submatrix
corresponding to the number $\lambda$:

$$G_n(\lambda) = \begin{pmatrix}
\lambda & 1 & & & 0 \\
 & \lambda & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda & 1 \\
0 & & & & \lambda
\end{pmatrix}$$

The Jordan submatrix has $\lambda$ on the main diagonal, ones on the
diagonal above the main diagonal, and zeros elsewhere.

Subspaces of the form $L(e_1, \dots, e_k)$, $k < n$, are invariant; in
each one of them the transformation is given by the Jordan
submatrix $G_k(\lambda)$ of dimension k.

It is obvious that the transformation $G_n(\lambda)$ is degenerate if and
only if $\lambda = 0$. In that case, $\mathcal{M} = G(L) = L(e_1, \dots, e_{n-1})$,
$\mathcal{N} = L(e_1)$.
Let us consider in more detail the three-dimensional case when
$\lambda = 0$. The transformation $A = G_3(0)$ is given by the formula

$$\begin{pmatrix} y^1 \\ y^2 \\ y^3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix}$$

that is, $y^1 = x^2$, $y^2 = x^3$, $y^3 = 0$.

Every straight line parallel to the $x^1$-axis and intersecting the
plane $x^1 = 0$ at the point (0, a, b) is carried into a point with
coordinates (a, b, 0) located in the plane $x^3 = 0$. If the axes $x^1, x^2, x^3$
are mutually perpendicular, then we can assume that at first all
the space is projected on the plane $(x^2, x^3)$, and then this plane is
imposed on the plane $(x^1, x^2)$ so that the positive $x^2$-axis merges
with the positive $x^1$-axis, and the positive $x^3$-axis with the positive
$x^2$-axis (Fig. 34).

[Fig. 34]
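
The Jordan submatrix and the three-dimensional case just described can be reproduced in a few lines. In this sketch the function names `jordan_block` and `apply` are assumptions made for the illustration.

```python
def jordan_block(n, lam):
    """n-dimensional Jordan submatrix: lam on the main diagonal,
    ones on the diagonal just above it, zeros elsewhere."""
    return [[lam if i == j else 1 if j == i + 1 else 0
             for j in range(n)] for i in range(n)]

def apply(a, x):
    """Image of the vector x under the transformation with matrix a."""
    return [sum(c * v for c, v in zip(row, x)) for row in a]

G = jordan_block(3, 0)

# Every point (t, a, b) of a line parallel to the x1-axis goes to (a, b, 0).
images = [apply(G, [t, 2, 5]) for t in (-1, 0, 7)]
```

The columns of `jordan_block(n, lam)` are the coordinate columns of the images of the basis vectors, in agreement with formulas (3).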

§ 6. Eigenvectors and the characteristic polynomial of a
transformation

1. Definition. An eigenvector of a given linear transformation A
is any nonzero vector x that satisfies the condition

$Ax = \lambda x$ (1)

where $\lambda$ is a scalar.

The number $\lambda$ is called the eigenvalue of the transformation A
that corresponds to the given eigenvector x.

For brevity, one says "$\lambda$ is the eigenvalue of the given
eigenvector". An eigenvector goes into a vector collinear with it. In
real space, the eigenvalue shows by what factor the eigenvector
is "stretched" (or, when $|\lambda| < 1$, "compressed").

It is easy to see that if x is an eigenvector, then $\alpha x$ is also an
eigenvector for any $\alpha \neq 0$ and that the linear hull of every
eigenvector constitutes an invariant one-dimensional subspace (an
invariant straight line).

2. In many problems of algebra and its applications, one is
called upon to find all the eigenvectors of a given linear
transformation. We now investigate this problem.

We consider a linear transformation y = Ax and also the
identity transformation E. We have Ex = x for all $x \in L$. Therefore,
condition (1), under which x is an eigenvector of the given
transformation, can be written as

$(A - \lambda E)x = \theta$ (2)

Let the transformation y = Ax be represented relative to a basis
$e_1, \dots, e_n$ by the formulas

$y^k = \sum_{i=1}^{n} A^k_i x^i, \quad k = 1, \dots, n$ (3)

Since the unit matrix $E = \|\delta^k_i\|$, it follows that because of (3) the
relation (2) is equivalent to the following system of homogeneous
equations:

$\sum_{i=1}^{n} (A^k_i - \lambda \delta^k_i) x^i = 0, \quad k = 1, \dots, n$ (4)

where $x^1, \dots, x^n$ are components of the eigenvector x relative to
the basis $e_1, \dots, e_n$ and $\lambda$ is the eigenvalue of x.

Definition. The matrix $A - \lambda E$ of system (4) is called the
characteristic matrix of the given transformation A; its determinant

$$p(\lambda) = \det(A - \lambda E) = \begin{vmatrix}
A^1_1 - \lambda & A^1_2 & \dots & A^1_n \\
A^2_1 & A^2_2 - \lambda & \dots & A^2_n \\
\dots & \dots & \dots & \dots \\
A^n_1 & A^n_2 & \dots & A^n_n - \lambda
\end{vmatrix}$$

is called the characteristic determinant of the transformation A.

Obviously, $p(\lambda)$ is a polynomial of degree n in $\lambda$. It is called
the characteristic polynomial of matrix A (or transformation A).

The general plan for solving problems involving eigenvectors
now reduces to the following. First of all, the so-called
characteristic equation

$p(\lambda) = 0$ (5)

is formed. Equation (5) is necessary and sufficient for system (4)
to have nontrivial solutions. Therefore, in complex space, the roots
of (5), and only these roots, are the eigenvalues of the
transformation A. In real space, the eigenvalues are all the real roots, and
only these roots. Suppose that all the roots $\lambda_1, \dots, \lambda_n$ have been
found. For the sake of definiteness, we will assume that we are dealing
with real space. Then we reject all complex roots and run through
the remaining ones, substituting each in turn for $\lambda$ in system (4).
The system then receives definite numerical coefficients each time.

The rank of the resulting system will be a number r, r < n, so
that the system will have n - r independent solutions. In finding
them, we thus find n - r independent eigenvectors with one
and the same eigenvalue, which is equal to the root taken. Their
linear hull, with the zero vector eliminated from it, yields all the
eigenvectors with the same eigenvalue. This follows from the
theorem on the solution set of a homogeneous linear system of
equations.

Having thus gone through all the real roots of the characteristic
equation, we are able to find all the eigenvectors of the given
transformation.

In the case of complex space, we have to go through all the roots
$\lambda_1, \dots, \lambda_n$.

3. Examples. (1) A similarity transformation is a transformation
for which all nonzero vectors are eigenvectors with the same
eigenvalue equal to the coefficient of similarity.

(2) The transformation $G_n(\lambda_0)$ (see Section 5, Subsection 8)
has only one linearly independent eigenvector. Indeed, for $G_n(\lambda_0)$
the characteristic polynomial $p(\lambda) = (\lambda_0 - \lambda)^n$ has the sole root
$\lambda = \lambda_0$ of multiplicity n. For $\lambda = \lambda_0$ the characteristic matrix
$G_n(\lambda_0) - \lambda_0 E$ is of rank n - 1 (the nonzero minor of order n - 1
is obtained by crossing out the left column and lowest row). For
this reason, a system of type (4) made up for the transformation
$G_n(\lambda_0)$ has, for $\lambda = \lambda_0$, only one linearly independent solution.
From formulas (3), Section 5, it is evident that the vector $e_1$ is an
eigenvector.

(3) The transformation of a two-dimensional real plane

$y^1 = x^1 + 2x^2$,
$y^2 = -x^1 + x^2$

does not have eigenvectors since $p(\lambda) = \lambda^2 - 2\lambda + 3$ does not
have any real roots.

We leave it to the reader to find the eigenvalues and the
eigenvectors in the other examples of Section 5.
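
For a transformation of the plane the plan above reduces to a quadratic equation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the matrix with rows (1, 2), (-1, 1) is taken to be the matrix of Example (3), which agrees with the characteristic polynomial quoted there.

```python
import math

def char_poly_2x2(m):
    """Coefficients (1, -trace, det) of p(lam) = lam**2 - trace*lam + det."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (1, -trace, det)

def real_eigenvalues_2x2(m):
    """Real roots of the characteristic equation, in increasing order."""
    _, b, c = char_poly_2x2(m)
    disc = b * b - 4 * c
    if disc < 0:
        return []          # no real roots, hence no eigenvectors in real space
    s = math.sqrt(disc)
    return sorted({(-b - s) / 2, (-b + s) / 2})

example_3 = [[1, 2], [-1, 1]]      # assumed matrix of Example (3)
compression = [[3, 0], [0, 1]]     # compression along the x1-axis, k = 3
```

For the compression the roots are 1 and 3; the corresponding eigenvectors lie on the coordinate axes.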
§ 7. Basic theorems on the characteristic polynomial and
eigenvectors

1. Theorem 1. The rank of a characteristic matrix is, for fixed
$\lambda$, an invariant with respect to a change of basis.

Proof. The theorem is a consequence of the general proposition
of the invariance of the rank of the matrix of a linear
transformation, since the characteristic matrix is the matrix of the
transformation $A - \lambda E$.

2. Theorem 2. The characteristic polynomial $p(\lambda)$ is invariant
with respect to the transformation of the basis.

Proof. Let A and A' be matrices of the given transformation
relative to the bases $e_1, \dots, e_n$ and $e'_1, \dots, e'_n$, P the matrix for
changing from the first basis to the second, and $Q = (P^*)^{-1}$. By
Section 2 we have

$A' - \lambda E = A' - \lambda E' = Q(A - \lambda E)Q^{-1}$ (1)

(E' = E since the identity transformation relative to any basis
has a unit matrix). From (1) we get

$\det(A' - \lambda E) = \det Q \cdot \det(A - \lambda E) \cdot \det Q^{-1} = \det(A - \lambda E)$

Remark. Let us write the characteristic polynomial as

$p(\lambda) = (-1)^n [\lambda^n - p_1 \lambda^{n-1} + p_2 \lambda^{n-2} - \dots + (-1)^n p_n]$

It is readily verified that $p_1$ is the trace of the matrix A and
$p_n = \det A$. From Theorem 2 follows the invariance of all coefficients
of $p(\lambda)$, in particular $p_1$ and $p_n$. We have thus obtained another
proof of the invariance of the determinant and of the trace of the
transformation matrix.
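
The invariance of the trace and the determinant can be observed on a 2 x 2 example. The sketch below assumes a transformation law of the form A' = Q A Q^{-1}, as in (1); the matrices A and Q and the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(q):
    """Inverse of a nonsingular 2 x 2 matrix."""
    d = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    return [[Fraction(q[1][1], d), Fraction(-q[0][1], d)],
            [Fraction(-q[1][0], d), Fraction(q[0][0], d)]]

def trace_and_det(m):
    """The coefficients p_1 (trace) and p_2 (determinant) for n = 2."""
    return (m[0][0] + m[1][1], m[0][0] * m[1][1] - m[0][1] * m[1][0])

# Hypothetical matrix of the transformation and of the change of basis.
A = [[1, 2], [3, 4]]
Q = [[2, 1], [1, 1]]

A_new = matmul(matmul(Q, A), inverse_2x2(Q))
```

The entries of the matrix change completely under the change of basis, yet the pair (trace, determinant), and with it the characteristic polynomial, stays the same.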

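3. Theorem 3. If $L = L_1 \oplus L_2$, where $L_1$ and $L_2$ are nonzero subspaces
invariant under A, then $p(\lambda) = p_1(\lambda) p_2(\lambda)$, where $p_1(\lambda)$, $p_2(\lambda)$ are
characteristic polynomials of the transformations induced in $L_1$
and $L_2$.

Proof. Let $e_1, \dots, e_k$ be a basis in $L_1$ and $e_{k+1}, \dots, e_n$ a basis in $L_2$.
By Section 4, the matrix $A - \lambda E$ relative to the basis $e_1, \dots, e_n$
of the space L has the form

$$\begin{pmatrix}
a_{11} - \lambda & a_{12} & \dots & a_{1k} & & & \\
a_{21} & a_{22} - \lambda & \dots & a_{2k} & & 0 & \\
\dots & \dots & \dots & \dots & & & \\
a_{k1} & a_{k2} & \dots & a_{kk} - \lambda & & & \\
 & & & & a_{k+1\,k+1} - \lambda & \dots & a_{k+1\,n} \\
 & 0 & & & \dots & \dots & \dots \\
 & & & & a_{n\,k+1} & \dots & a_{nn} - \lambda
\end{pmatrix}$$

whence

$$\det(A - \lambda E) = \begin{vmatrix}
a_{11} - \lambda & a_{12} & \dots & a_{1k} \\
a_{21} & a_{22} - \lambda & \dots & a_{2k} \\
\dots & \dots & \dots & \dots \\
a_{k1} & a_{k2} & \dots & a_{kk} - \lambda
\end{vmatrix} \times \begin{vmatrix}
a_{k+1\,k+1} - \lambda & \dots & a_{k+1\,n} \\
\dots & \dots & \dots \\
a_{n\,k+1} & \dots & a_{nn} - \lambda
\end{vmatrix} = p_1(\lambda) p_2(\lambda)$$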


4. Theorem 4. If to a certain eigenvalue $\lambda$ there correspond m
linearly independent eigenvectors, then their linear hull L' is an
m-dimensional invariant subspace and the transformation induced
in L' is a similarity transformation with coefficient $\lambda$.

Proof. Let $e_1, \dots, e_m$ be linearly independent eigenvectors
corresponding to the number $\lambda$. Take an arbitrary vector x in
$L' = L(e_1, \dots, e_m)$, expand it relative to the basis $e_1, \dots, e_m$ of the
subspace L' and apply to it the given transformation A to get

$Ax = A(x^1 e_1 + \dots + x^m e_m) = x^1 Ae_1 + \dots + x^m Ae_m = x^1 \lambda e_1 + \dots + x^m \lambda e_m = \lambda x \in L'$

whence follows Theorem 4.


5. In applications of linear algebra an important part is played
by the question of simplifying the matrix of a linear
transformation by choosing a suitable basis.

Theorem 5. A transformation matrix A is diagonal if and only
if the basis consists of eigenvectors.

Proof. (1) If the basis $e_1, \dots, e_n$ consists of eigenvectors of the
transformation A, then

$Ae_1 = \lambda_1 e_1$,
$Ae_2 = \lambda_2 e_2$, (2)
. . . . . . .
$Ae_n = \lambda_n e_n$

where $\lambda_1, \lambda_2, \dots, \lambda_n$ are eigenvalues (in general, distinct). The
coefficients of the right members of (2) form the matrix $A^*$, which
in this case coincides with A:

$$A^* = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix} = A$$ (3)

Relative to this basis, the transformation y = Ax has the
coordinate (component) representation

$y^1 = \lambda_1 x^1$,
$y^2 = \lambda_2 x^2$, (4)
. . . . . . .
$y^n = \lambda_n x^n$

(2) Given, relative to a basis $e_1, \dots, e_n$, a matrix A in diagonal
form:

$$A = \begin{pmatrix} A^1_1 & & & 0 \\ & A^2_2 & & \\ & & \ddots & \\ 0 & & & A^n_n \end{pmatrix}$$

Then $A^* = A$,

$Ae_1 = A^1_1 e_1$,
$Ae_2 = A^2_2 e_2$,
. . . . . . .
$Ae_n = A^n_n e_n$

and this means that all the basis vectors are eigenvectors with
$\lambda_i = A^i_i$. The theorem is proved.

Remark. The transformation (4) may be represented as a
product of n compressions: first a compression with coefficient $\lambda_1$ to
the subspace $L(e_2, \dots, e_n)$ in the direction of $L(e_1)$, then a
compression with coefficient $\lambda_2$ to the subspace $L(e_1, e_3, \dots, e_n)$ in the
direction of $L(e_2)$, and so on. It is easy to verify that the
compressions can be carried out in any order. (When $|\lambda_i| > 1$ we have a
stretching.)


6. The examples in Subsection 3 of the preceding section show 
that there may not be a basis of eigenvectors, in which case the 
transformation matrix cannot be reduced to diagonal form. Just 
how the transformation matrix may be simplified in that case is 
considered below in Sections 9, 10. 


§ 8. Nilpotent transformations.

The general structure of singular transformations

1. In this section we consider singular linear transformations in
n-dimensional space L (it is immaterial whether the space be real
or complex).

2. Definition. A transformation B is said to be nilpotent if
$B^p = 0$ for some positive integer p.

In other words,

$B^p x = \theta$ for any $x \in L$ (1)

The smallest (natural) number p for which (1) holds true is
called the height of the nilpotent transformation.

Remark. If $B^p x = \theta$ for a number p and a vector $x \in L$, then
for x and any integer m > p we have

$B^m x = B^{m-p}(B^p x) = B^{m-p}\theta = \theta$

The simplest example of a nilpotent transformation is the zero
transformation 0; its height is equal to unity.

Every nilpotent transformation is singular. This is clear, for
if $B^p = 0$, then $\det(B^p) = (\det B)^p = 0$, hence $\det B = 0$.

However, nilpotent transformations are not merely a special case
of singular transformations. They are the basic element in the
structure of every singular transformation. Namely, the following
theorem holds.

Theorem. Let B be a singular linear transformation of the
space L. Then

$L = L_1 \oplus L_2$ (2)

where $L_1$, $L_2$ are invariant subspaces and

(1) the transformation induced in $L_1$ is nilpotent;

(2) if the subspace $L_2$ is not a zero subspace, then the
transformation induced in it is nonsingular.

In short, we can say that B is nilpotent in $L_1$ and nonsingular
in $L_2$. In particular, the transformation B is nilpotent in L if the
dimension of $L_1$ is equal to n ($L_1 = L$); the dimension of $L_2$
is zero ($L_2 = \theta$) in this case alone.

The proof is given below in Subsection 4 because it is based on
the auxiliary propositions given in Subsection 3.


3. Consider the successive powers of a given singular
transformation B:

$B, B^2, B^3, \dots, B^k, \dots$

Denote by $\mathcal{N}_k$ the null space of the transformation $B^k$ and by $\mathcal{M}_k$
the image of the entire space L under the transformation $B^k$. Let $r_k$
be the dimension of $\mathcal{M}_k$. By Subsection 1 of Section 3,

$r_k = \operatorname{rank}(B^k)$

Let us investigate the properties of sequences of the subspaces
$\mathcal{M}_k$ and $\mathcal{N}_k$ (k = 1, 2, ...).
(1) For any k, $\mathcal{M}_k$ is an invariant subspace under the
transformation B.

Proof. If $x \in \mathcal{M}_k$, then there exists a $y \in L$ such that $B^k y = x$,
whence we have

$Bx = B(B^k y) = B^{k+1} y = B^k(By) \in \mathcal{M}_k$

(2) We have the following inclusions:

$L \supset \mathcal{M}_1 \supset \mathcal{M}_2 \supset \dots \supset \mathcal{M}_k \supset \mathcal{M}_{k+1} \supset \dots$ (3)

Indeed, the inclusions (3) follow from the preceding property
since $B(\mathcal{M}_k) = \mathcal{M}_{k+1}$.

(3) The relations

$r_1 > r_2 > \dots > r_{p-1} > r_p = r_{p+1} = \dots$ (4)

where p is natural, hold true. At the same time,

$\mathcal{M}_k = \mathcal{M}_p$ when $k > p$

A nonsingular transformation is induced in the invariant
subspace $\mathcal{M}_p$.

Proof. It is seen from (3) that $r_j \geq r_{j+1}$. Because of the
singularity of B we have $r_1 < n$. The ranks $r_j$ are non-negative and so
there can only be a finite number of strict inequalities in (4). Let p
be the smallest natural number for which the equation

$r_p = r_{p+1}$ (5)

holds true. If $r_p = 0$, then clearly $r_k = r_p = 0$ and $\mathcal{M}_k = \mathcal{M}_p = \theta$
for $k > p$. Let $r_p \geq 1$. Then from (5) and (3) it follows that B
induces in the subspace $\mathcal{M}_p$ a nonsingular transformation, that is,
$\mathcal{M}_{p+1} = B(\mathcal{M}_p) = \mathcal{M}_p$, whence $\mathcal{M}_{p+2} = B(\mathcal{M}_{p+1}) = B(\mathcal{M}_p) = \mathcal{M}_p$.
Similarly, $\mathcal{M}_{p+3} = \mathcal{M}_p$, ..., $\mathcal{M}_k = \mathcal{M}_p$ for any $k > p$. At the same
time, $r_k = r_p$ for $k > p$. The proof of the third property is
complete.

(4) We have the following inclusions:

$\mathcal{N}_1 \subset \mathcal{N}_2 \subset \dots \subset \mathcal{N}_k \subset \mathcal{N}_{k+1} \subset \dots$ (6)

$\mathcal{N}_k = \mathcal{N}_p$ if $k > p$ (7)

where p is the smallest number that satisfies the condition (5).

Proof. The inclusions (6) are obvious, for if $x \in \mathcal{N}_k$, then
$B^k x = \theta$ and $B^{k+1} x = B(B^k x) = \theta$, so that $x \in \mathcal{N}_{k+1}$.

By Theorem 2, Section 3, the dimension $n_k$ of subspace $\mathcal{N}_k$ is
equal to $n - r_k$. As long as the ranks $r_k$ decrease strictly with
increasing k, the dimensions $n_k$ increase strictly. For $k \geq p$, all
ranks $r_k$ are the same; so also are the dimensions $n_p, n_{p+1}, \dots$.
From this and from (6) follows (7).

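(5) For any k, $\mathcal{N}_k$ is an invariant subspace of the
transformation B. The transformation induced in $\mathcal{N}_k$ is nilpotent.

Proof. The invariance of $\mathcal{N}_k$ for k = 1 follows from the
relations $B(\mathcal{N}_1) = \theta \subset \mathcal{N}_1$ and for k > 1 from the inclusions (6),
since $B(\mathcal{N}_k) \subset \mathcal{N}_{k-1} \subset \mathcal{N}_k$. The nilpotency of the transformation
induced in $\mathcal{N}_k$ is obvious since $B^k(\mathcal{N}_k) = \theta$.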

4. Proof of the theorem. We will show that

$L = \mathcal{N}_p \oplus \mathcal{M}_p$ (8)

if p satisfies condition (5). Since B is nilpotent in $\mathcal{N}_p$ and
nonsingular in $\mathcal{M}_p$, we thus obtain the desired expansion (2) ($L_1 = \mathcal{N}_p$,
$L_2 = \mathcal{M}_p$). We know that the sum of the dimensions of $\mathcal{N}_p$ and
$\mathcal{M}_p$ is n and so to obtain (8) it suffices to verify that

$\mathcal{N}_p \cap \mathcal{M}_p = \theta$ (9)

(see Section 14, Chapter I, in this connection).

Equation (9) is proved by contradiction. Let $x \neq \theta$,
$x \in \mathcal{N}_p \cap \mathcal{M}_p$. We consider the vectors

$x, Bx, \dots, B^p x = \theta$ (10)

They all belong to $\mathcal{M}_p$ (because $\mathcal{M}_p$ is invariant). Denote by y the
last of the vectors of system (10) that is different from $\theta$ ($y = B^k x$,
where k is a number such that $0 \leq k < p$). Then we have

$y \neq \theta, \quad By = \theta, \quad y \in \mathcal{M}_p$ (11)

But (11) contradicts the nonsingularity of the transformation B
in the subspace $\mathcal{M}_p$. Thus (9) is established and the theorem is
proved.

5. Remarks. (1) From Subsection 3 it is seen that the height of
the nilpotent transformation induced in the subspace $\mathcal{N}_p$ is equal
to p (here and above, p is the smallest number satisfying the
condition (5); as k increases, for $k < p$, we have an extension of the
subspaces $\mathcal{N}_k$ and a restriction of the subspaces $\mathcal{M}_k$; when $k \geq p$
the subspaces $\mathcal{N}_k$ and $\mathcal{M}_k$ no longer vary).

Thus, the subspace $\mathcal{N}_p$ may be found as the null space of the
transformation $B^k$ for any $k \geq p$. Similarly, $\mathcal{M}_p = B^k(L)$ for
$k \geq p$.

(2) It is likewise easy to see that for $k < p$ the intersections
$\mathcal{N}_k \cap \mathcal{M}_k$ contain nonzero vectors, and so the sums $\mathcal{N}_k + \mathcal{M}_k$ are
not direct sums and do not exhaust the space L.
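
The number p of condition (5) can be found by computing the ranks of successive powers until they stop decreasing. The sketch below is illustrative; the helper names and the 3 x 3 matrix (a nilpotent 2 x 2 box together with a nonsingular 1 x 1 box) are assumptions made for the example.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(rows):
    """Rank of a matrix, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [p - f * q for p, q in zip(rows[i], rows[r])]
        r += 1
    return r

def rank_sequence(b):
    """Return (p, [r_1, ..., r_{p+1}]), p being the smallest k with r_k = r_{k+1}."""
    ranks = []
    power = b
    while True:
        ranks.append(rank(power))
        if len(ranks) >= 2 and ranks[-1] == ranks[-2]:
            return len(ranks) - 1, ranks
        power = matmul(power, b)

# Hypothetical singular transformation.
B = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 2]]

p, ranks = rank_sequence(B)
nullity_p = len(B) - ranks[p - 1]   # dimension of the null space of B**p
```

Here the ranks are 2, 1, 1, so p = 2; the null space of the second power is two-dimensional, the image is one-dimensional, and the dimensions add up to n = 3 as (8) requires.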
§ 9. The canonical basis of a nilpotent transformation

1. Let us consider some other questions relating to nilpotent
transformations. First of all we shall need some terminology that
will be important for what follows.

We say that the vectors $a_1, a_2, \dots, a_k$ form a sequence of length
k relative to a transformation B (which is not necessarily
nilpotent) if these vectors are not zero vectors and if

$Ba_1 = a_2, \quad Ba_2 = a_3, \quad \dots, \quad Ba_{k-1} = a_k, \quad Ba_k = \theta$ (1)

We will say that $a_1$ is the senior or first vector and $a_k$ is the
junior or last vector of the sequence (1). If $a \neq \theta$, $Ba = \theta$, we
will say that a forms a sequence consisting of a single vector that
is both senior and junior at the same time.

We say that a basis of space L is canonical relative to the
transformation B if it consists of a single sequence or of several
sequences that do not have any vectors in common.

2. We note the following properties of nilpotent transformations:

(1) If in a space L there is a sequence relative to a nilpotent
transformation B containing k vectors, then the height of the
transformation is $p \geq k$.

Indeed, in a sequence of type (1), $B^{k-1} a_1 = a_k \neq \theta$, whence
$p > k - 1$.

(2) If the height of the nilpotent transformation B is p, then
there exists a sequence relative to B of length p and there are no
longer sequences.

Proof. By the definition of a height, in the space there is a
vector x such that $B^{p-1} x \neq \theta$. Then the vectors

$x, Bx, B^2 x, \dots, B^{p-1} x$

form a sequence of length p since there are no zero vectors among
them (otherwise $B^{p-1} x$ would be a zero vector) and
$B(B^{p-1} x) = B^p x = \theta$. By the preceding property, there can be no longer
sequence in the space.

(3) Every sequence is linearly independent.

Proof. Write down the following relation for an arbitrary
sequence (1):

$\lambda_1 a_1 + \lambda_2 a_2 + \dots + \lambda_k a_k = \theta$ (*)

Operate on both members of this equation with the operator
$B^{k-1}$ to obtain $\lambda_1 a_k = \theta$, since $B^{k-1} a_i = \theta$ for $i > 1$. Since
$a_k \neq \theta$, we find $\lambda_1 = 0$. Now operating on (*) with the operator
$B^{k-2}$ we find $\lambda_2 = 0$. Continuing the process, we find that all
numbers $\lambda_i = 0$. The contention is proved.
Corollary. If n is the dimension of the space L and p is the
height of the nilpotent transformation in L, then

$p \leq n$

(4) If in L there is a canonical basis for a transformation B,
then B is nilpotent and its height is equal to the number of
vectors in the longest sequence of that basis.

Proof. Let $e_1, \dots, e_n$ be a canonical basis and let k vectors
enter into the longest of its sequences. Then for every basis vector
we have: $B^k e_j = \theta$. Let us take an arbitrary vector $x \in L$, expand
it in terms of the basis $e_1, \dots, e_n$, and apply to it the
transformation $B^k$:

$B^k x = B^k(x^1 e_1 + \dots + x^n e_n) = x^1 B^k e_1 + \dots + x^n B^k e_n = \theta$

This means that B is nilpotent and its height is $p \leq k$. On the
other hand, by property (1) we have $p \geq k$. Hence p = k.
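
The notions of height and of a sequence can be illustrated on a small nilpotent matrix. In the sketch below the function names are assumptions; `height` relies on the corollary above (p does not exceed n), so powers beyond the n-th need not be examined.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(a, x):
    """Image of the vector x under the transformation with matrix a."""
    return [sum(c * v for c, v in zip(row, x)) for row in a]

def height(b):
    """Smallest p with B**p = 0, or None if B is not nilpotent."""
    power = b
    for p in range(1, len(b) + 1):      # at step p, `power` holds B**p
        if all(v == 0 for row in power for v in row):
            return p
        power = matmul(power, b)
    return None

def sequence_from(b, x):
    """The vectors x, Bx, B^2 x, ... up to the last nonzero one."""
    seq = []
    # The length guard stops the loop if B is not nilpotent on x.
    while any(v != 0 for v in x) and len(seq) <= len(b):
        seq.append(x)
        x = apply(b, x)
    return seq

# Hypothetical example: ones above the main diagonal, zeros elsewhere.
B = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
seq = sequence_from(B, [0, 0, 1])
```

The sequence starting from the third basis vector has length 3, equal to the height of B, in agreement with properties (1) and (2).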

3. Examples. (1) For the zero transformation 0, every nonzero vector
forms a sequence of length k = 1, and so every basis of the
space L is canonical with respect to 0. It is readily seen that if B
has height p = 1, then it is a zero transformation (B = 0).

(2) Consider the transformation $G_n(0)$ (see Subsection 8,
Section 5, for $\lambda = 0$). By the definition of $G_n(0)$, there is a basis
$e_1, \dots, e_n$ consisting of a single sequence:

$G_n(0)e_n = e_{n-1}, \quad \dots, \quad G_n(0)e_2 = e_1, \quad G_n(0)e_1 = \theta$

From this and from property (4) of Subsection 2 it follows that
$G_n(0)$ is nilpotent and its height is p = n.

Observe that the matrix of this transformation relative to the
given basis $e_1, \dots, e_n$ is the singular n-dimensional Jordan
submatrix

$$G_n(0) = \begin{pmatrix}
0 & 1 & & & 0 \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
0 & & & & 0
\end{pmatrix}$$

(3) Let a transformation B be given, relative to a basis
$e_1, \dots, e_n$, by a matrix in which the main diagonal accommodates
several singular Jordan submatrices of various dimensions, all
other elements being zero. We write this matrix symbolically as

$$B = \begin{pmatrix}
G_{k_1}(0) & & & 0 \\
 & G_{k_2}(0) & & \\
 & & \ddots & \\
0 & & & G_{k_l}(0)
\end{pmatrix}$$ (2)

Without loss of generality, we may assume that $k_1 \geq k_2 \geq \dots \geq k_l$,
since by renumbering the basis vectors we can achieve a
permutation of the submatrices on the diagonal of the matrix B.

For clarity, we write out in full a matrix of
type (2) with three submatrices $G_{k_i}(0)$ of dimensions $k_1 = 4$, $k_2 = 3$
and $k_3 = 1$:

$$\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$ (2a)

The one-dimensional submatrix Gi(0) consists of the single scalar 
zero. Within each of the submatrices G^fO) of dimension k'^2 
there is a diagonal above the main diagonal consisting entirely of 
units. The sequence of units between every pair of adjacent sub¬ 
matrices is broken by one zero. 

Thus, one diagonal of the matrix (2a), like any matrix of type 
(2), consists of a sequence of units broken by zeros. This diagonal 
lies immediately above the main diagonal of the matrix. All other 
entries of the matrix are zeros. 

Because of (2), the basis $e_1, \dots, e_n$ is canonical and it consists
of $l$ disjoint sequences with the length of the longest being $k_1$.
From this it follows that $B$ is a nilpotent transformation with
height $p = k_1$. The linear hull of every sequence entering into a
basis is an invariant subspace.
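
The statement about the height can be checked numerically. The following is a minimal sketch in Python, not part of the original exposition; the helper names `jordan_block_zero`, `block_diag`, `mat_mul` and `height` are illustrative assumptions. It assembles a matrix of type (2) from given block dimensions and searches for the smallest $p$ with $B^p = 0$.

```python
def jordan_block_zero(k):
    """Singular Jordan submatrix G_k(0): units on the diagonal above the main one."""
    return [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]


def block_diag(blocks):
    """Place the given square blocks along the main diagonal, zeros elsewhere."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                M[offset + i][offset + j] = value
        offset += len(b)
    return M


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def height(B):
    """Smallest p >= 1 with B^p = 0, or None if B is not nilpotent."""
    n = len(B)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # B^0 = E
    for p in range(1, n + 1):
        power = mat_mul(power, B)
        if all(v == 0 for row in power for v in row):
            return p
    return None  # the height of a nilpotent B never exceeds n


# Block dimensions k_1 = 4, k_2 = 3, k_3 = 1, as in matrix (2a)
B = block_diag([jordan_block_zero(4), jordan_block_zero(3), jordan_block_zero(1)])
print(height(B))
```

For the block dimensions 4, 3, 1 the sketch reports the height 4, in agreement with $p = k_1$.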

This example includes the two preceding ones as special cases.
If $k_1 = 1$, then all submatrices are one-dimensional, each consisting
of the single scalar zero, and the entire matrix is a zero matrix
and $B$ is a zero transformation. If $l = 1$, then the matrix $B$
consists of one submatrix: $B = G_n(0)$.

It is quite evident that, conversely, if for a transformation $B$
there is a canonical basis and disjoint sequences of the basis are
arranged one after the other, then the matrix of the transformation
$B$, relative to such a basis, is of type (2). To each sequence
corresponds a submatrix $G_k(0)$ whose dimension $k$ is equal to
the number of vectors in the sequence.


4. The last of the foregoing examples embraces all possible
nilpotent transformations. This follows from the basic theorem
(Theorem 1) which we will soon state and prove.


5. Lemma. Let a system of vectors be given which is a union of several
sequences. If the last vectors of all these sequences constitute
a linearly independent system, then the given system of vectors is
also linearly independent.

The proof is most conveniently carried out after the basic
theorem.

Theorem 1. For every nilpotent transformation there exists a
canonical basis (which is far from unique).

Proof. We carry out the proof constructively, that is, we will
actually show how to construct a canonical basis. Let $B$ be a
nilpotent transformation of height $p$ in an $n$-dimensional space $L$.

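We consider the familiar sequence of inclusions

$$\mathcal{N}_1 \subset \mathcal{N}_2 \subset \dots \subset \mathcal{N}_p = L$$

where $\mathcal{N}_k$ is the null space of the transformation $B^k$. Construct
the following subspaces:

$$\mathcal{K}_2 = B(\mathcal{N}_2), \quad \mathcal{K}_3 = B^2(\mathcal{N}_3), \quad \dots, \quad \mathcal{K}_p = B^{p-1}(\mathcal{N}_p)$$

We have $B(\mathcal{K}_j) = 0$. Hence, all $\mathcal{K}_2, \mathcal{K}_3, \dots, \mathcal{K}_p$ belong to
$\mathcal{N}_1$ ($\mathcal{K}_j \subset \mathcal{N}_1$). On the other hand,

$$\mathcal{K}_{j+1} = B^j(\mathcal{N}_{j+1}) = B^{j-1} B(\mathcal{N}_{j+1}) \subset B^{j-1}(\mathcal{N}_j) = \mathcal{K}_j$$

(since $B(\mathcal{N}_{j+1}) \subset \mathcal{N}_j$). Thus

$$\mathcal{K}_p \subset \mathcal{K}_{p-1} \subset \dots \subset \mathcal{K}_2 \subset \mathcal{N}_1$$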


Let $k_j$ be the dimension of $\mathcal{K}_j$ and $n_1$ the dimension of $\mathcal{N}_1$.
Choose in $\mathcal{N}_1$ a basis whose vectors are denoted by

$$e_1^p, \dots, e_{k_p}^p; \quad e_{k_p+1}^{p-1}, \dots, e_{k_{p-1}}^{p-1}; \quad \dots; \quad e_{k_3+1}^2, \dots, e_{k_2}^2; \quad e_{k_2+1}^1, \dots, e_{n_1}^1 \qquad (3)$$

The semicolons separate any two groups of vectors, each of
which is of a specific nature. The groups are indicated by
superscripts.

The choice of basis (3) is done under the following conditions.
For the first group, $e_i^p$, take any basis in $\mathcal{K}_p$; for the second
group, $e_i^{p-1}$, take any complement of the vectors $e_i^p$ to a basis
of $\mathcal{K}_{p-1}$, and so on.

Since $e_i^p \in \mathcal{K}_p = B^{p-1}(\mathcal{N}_p)$, all these vectors have inverse
images under $B^{p-1}$. For the vector $e_i^p$ take some inverse image $e_i^1$.
We have $e_i^p = B^{p-1}(e_i^1)$. At the same time for every $i$
($i = 1, \dots, k_p$) we obtain a sequence of length $p$:

$$e_i^1, \quad e_i^2 = B(e_i^1), \quad \dots, \quad e_i^p = B^{p-1}(e_i^1); \qquad B^p(e_i^1) = 0$$

Similarly, for $k_p < i \le k_{p-1}$, the vectors $e_i^{p-1}$ have inverse
images $e_i^1$ under $B^{p-2}$ because for these values of $i$ we have
$e_i^{p-1} \in \mathcal{K}_{p-1} = B^{p-2}(\mathcal{N}_{p-1})$. Accordingly, for every $i$
($k_p < i \le k_{p-1}$) we obtain a sequence of length $p - 1$:

$$e_i^1, \quad e_i^2 = B(e_i^1), \quad \dots, \quad e_i^{p-1} = B^{p-2}(e_i^1); \qquad B^{p-1}(e_i^1) = 0$$

Continuing this process, we obtain a system of vectors that can be
written compactly in the following array:

$$\begin{array}{lllll}
e^p_{k_p};     & e^{p-1}_{k_{p-1}}; & \dots; & e^2_{k_2}; & e^1_{n_1} \\
e^{p-1}_{k_p}; & e^{p-2}_{k_{p-1}}; & \dots; & e^1_{k_2}  &           \\
\vdots         & \vdots             &        &            &           \\
e^2_{k_p};     & e^1_{k_{p-1}}      &        &            &           \\
e^1_{k_p}      &                    &        &            &
\end{array} \qquad (4)$$

Here, in each row only the last representative of each group is
written. For example, of the group $e_1^p, \dots, e_{k_p}^p$ only $e_{k_p}^p$ is
given in the first row.

Some group may be nonexistent in (4). For instance, if $\mathcal{K}_p$
coincides with $\mathcal{K}_{p-1}$, then in (4) the second group has to be crossed
out. Also observe that if, say, $\mathcal{K}_2 = \mathcal{K}_3$ ($k_2 = k_3$), then in the
second row of (4) the group $e^1_{k_2}$ drops out and the next row is of
the same length. The sequences in (4) are arranged by columns
and move upwards. The lower index may be regarded as the number
label of the sequence, the upper index as the number label of
the vector within the sequence. Thus, the upper row accommodates
the last vectors of the sequences. They constitute a basis in $\mathcal{N}_1$
and hence are linearly independent, whence, by the lemma, we
conclude that all the vectors of (4) are independent. We now prove
that their total number $l$ is equal to $n$, i.e., the dimension of $L$.
Clearly

$$l = n_1 + k_2 + k_3 + \dots + k_p \qquad (5)$$

(note again that in (4) each column is actually a symbolical
representation of several columns). Let $n_k$ denote the dimension of
$\mathcal{N}_k$ and let $\rho_k$ be the rank of the transformation $B^k$ on the
subspace $\mathcal{N}_{k+1}$ (which is invariant under $B^k$, since
$B^k(\mathcal{N}_{k+1}) = \mathcal{K}_{k+1} \subset \mathcal{N}_1 \subset \mathcal{N}_{k+1}$). Noting that $\mathcal{N}_k$ is the null
space of $B^k$ in the subspace $\mathcal{N}_{k+1}$, we have

$$\begin{aligned}
n_1 &= n_2 - \rho_1, & k_2 &= \rho_1; \\
n_2 &= n_3 - \rho_2, & k_3 &= \rho_2; \\
&\dots \\
n_{p-1} &= n_p - \rho_{p-1}, & k_p &= \rho_{p-1}; \\
n_p &= n
\end{aligned}$$

whence $n = n_1 + \rho_1 + \dots + \rho_{p-1} = n_1 + k_2 + \dots + k_p$. Consequently,
$l = n$ and the theorem is proved.


Fig. 35

Fig. 35 is an illustration of the array in (4) for the case $n = 4$,
$p = 3$, $n_1 = 2$, and the spaces $\mathcal{K}_2$ and $\mathcal{K}_3$ are one-dimensional and
coincide.
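
As a worked illustration of the construction for exactly this case, consider the following sketch. The basis $f_1, \dots, f_4$ and the rule defining $B$ on it are assumptions introduced only for this example; they do not come from the text.

```latex
% Assumed example: B f_1 = 0, B f_2 = f_1, B f_3 = f_2, B f_4 = 0   (n = 4, p = 3)
\begin{align*}
\mathcal{N}_1 &= L(f_1, f_4), \quad \mathcal{N}_2 = L(f_1, f_2, f_4), \quad \mathcal{N}_3 = L, \\
\mathcal{K}_2 &= B(\mathcal{N}_2) = L(f_1), \quad \mathcal{K}_3 = B^2(\mathcal{N}_3) = L(f_1), \\
n_1 &= 2, \quad k_2 = k_3 = 1, \quad l = n_1 + k_2 + k_3 = 4 = n .
\end{align*}
% Basis (3) of N_1: e_1^3 = f_1 is a basis of K_3; the group belonging to K_2
% is empty because K_2 = K_3; e_2^1 = f_4 completes e_1^3 to a basis of N_1.
% Inverse images: e_1^1 = f_3 (since B^2 f_3 = f_1) and e_1^2 = B e_1^1 = f_2.
% Array (4) for this example:
\[
\begin{array}{ll}
e_1^3 = f_1; & e_2^1 = f_4 \\
e_1^2 = f_2  &             \\
e_1^1 = f_3  &
\end{array}
\]
```

The canonical basis obtained in this way consists of one sequence of length 3 and one sequence of length 1.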

Remark. If we wish to prove merely the existence of a canonical
basis, it is possible to confine oneself to a briefer argument and
take advantage of induction on the height of the transformation.
Namely, let there be given in the space $L$ a nilpotent transformation
$B$ of height $p + 1 \ge 2$. Then in the subspace $\mathcal{M}_1 = B(L)$ the
transformation $B$ has height $p$. In $\mathcal{M}_1$ let a canonical basis be
found (if $p = 1$, then any basis in $\mathcal{M}_1$ will be canonical). We
supplement the initial vectors of the sequences of this basis with
inverse images relative to $B$, thus lengthening each sequence by one
vector. Then we complete the collection of the last vectors of the
sequences to a basis in the subspace $\mathcal{N}_1$. This yields a system of
$r_1 + n_1 = n$ vectors ($r_1$ = rank $B$ = dimension of $\mathcal{M}_1$) that is
independent by virtue of the lemma and, for this reason, forms a basis
in $L$, which is clearly canonical.

6. Let us now return to the proof of the lemma. We can always
assume that the given system of vectors is written as (4). Now
form an arbitrary linear combination of all vectors of this system
and equate it to the zero vector. In expanded notation we get the
following equation, all sums in the left member of which are taken
only over the lower index:

$$\begin{aligned}
&\sum_{k_p} a_i^p e_i^p + \sum_{k_{p-1}} a_i^{p-1} e_i^{p-1} + \dots + \sum_{k_2} a_i^2 e_i^2 + \sum_{n_1} a_i^1 e_i^1 \\
&+ \sum_{k_p} a_i^{p-1} e_i^{p-1} + \sum_{k_{p-1}} a_i^{p-2} e_i^{p-2} + \dots + \sum_{k_2} a_i^1 e_i^1 \\
&+ \dots \\
&+ \sum_{k_p} a_i^1 e_i^1 = 0
\end{aligned} \qquad (6)$$

Here $a_i^j$ are numbers (the coefficients of the linear combination).
The arrangement of the sums corresponds to the array in (4).
Indicated under the summation symbol is the number up to which
the summation is carried with respect to $i$, with the summation
beginning with the number following the one written under the
preceding summation symbol of that row (in the first sums of each
row the summation begins with unity).

If some group of columns is absent in (4), then in the
corresponding column of (6) we take the factors $a_i^j$ to be equal to zero.

Now let us act on both members of (6) with the operator $B^{p-1}$
to get from the last row

$$\sum_{k_p} a_i^1 B^{p-1}(e_i^1) = 0$$

or

$$\sum_{k_p} a_i^1 e_i^p = 0 \qquad (7)$$

(all other terms of the sum (6), when operated on by the operator
$B^{p-1}$, yield 0).

If the last vectors of all sequences form an independent system,
then the part $e_1^p, \dots, e_{k_p}^p$ of this system is also independent. Then
from (7) we have

$$a_1^1 = 0, \quad \dots, \quad a_{k_p}^1 = 0$$

Now in (6) cross out the lower row and then operate on the
remainder with the operator $B^{p-2}$. As before, we find that the scalars
$a_i^2$, $a_i^1$ that participate in the next to the lowest row are all equal
to zero. Continuing the process, we find that in general all $a_i^j = 0$.
The proof of the lemma is complete.


7. Let $B$ be a nilpotent transformation with height $p$. Denote by
$l_j$ the number of sequences of length $j$ relative to some canonical
basis of $B$.

Theorem 2. For every $j$ ($j \le p$) the number $l_j$ is invariant
relative to a transition to another canonical basis of $B$.

Proof. For the basis that was constructed in the proof of the
preceding theorem, we have, by construction,

$$l_p = k_p; \qquad l_j = k_j - k_{j+1}, \quad 2 \le j < p; \qquad l_1 = n_1 - k_2$$

Consider an absolutely arbitrary canonical basis. Observe that the
last vectors of all its sequences must lie in $\mathcal{N}_1$. Denote the total
number of these vectors by $n_1'$, the number lying in $\mathcal{K}_p$ by $k_p'$, the
number lying in $\mathcal{K}_{p-1}$ by $k_{p-1}'$ and so on.

We have

$$n_1' \le n_1, \qquad k_j' \le k_j \qquad (8)$$

since in each case the number of independent vectors does not
exceed the dimension of the space containing them. Let $n'$ be the
number of all vectors of the arbitrary basis under consideration.
Then, by the proof of the preceding theorem (see (5)),

$$\begin{aligned}
n' &= n_1' + k_2' + \dots + k_p', \\
n  &= n_1 + k_2 + \dots + k_p
\end{aligned} \qquad (8a)$$

Since $n' = n$, from (8) and (8a) we find $n_1' = n_1$, $k_j' = k_j$. But

$$l_p' = k_p'; \qquad l_j' = k_j' - k_{j+1}', \quad 2 \le j < p; \qquad l_1' = n_1' - k_2'$$

Hence, $l_j' = l_j$ for any $j$. The theorem is proved.

Remark. The gist of the proof of this theorem may be stated in
a few words thus: the dimensions $k_j$ and $n_1$ of the subspaces $\mathcal{K}_j$ and
$\mathcal{N}_1$ are invariant by the definition of these subspaces; but for any
canonical basis, all $l_j$ are expressible in terms of $k_j$ and $n_1$. Hence
all $l_j$ are also invariant.


8. It is easy to express $l_j$ in terms of the ranks of the
transformations $B^j$ on the given space $L$.

Denote the rank of $B^j$ on $L$ by $r_j$. By the foregoing we have

$$n_j = n_{j+1} - \rho_j$$

Yet

$$n_j = n - r_j, \qquad n_{j+1} = n - r_{j+1}$$

From these equations we get

$$\rho_j = r_j - r_{j+1}$$

Thus, for $2 \le j < p$,

$$l_j = k_j - k_{j+1} = \rho_{j-1} - \rho_j = r_{j-1} - 2r_j + r_{j+1} \qquad (9)$$

Besides,

$$l_1 = n - 2r_1 + r_2, \qquad l_p = r_{p-1} \qquad (9a)$$

Remark. To each sequence of a canonical basis corresponds a
Jordan submatrix in the matrix (2). Therefore formula (9) expresses
the number $l_k$ of Jordan submatrices of dimension $k$ in the
matrix (2) for all values of $k$ ($1 \le k \le n$). Also note that assuming
$r_0 = n$ (as the rank of $B^0 = E$) and $r_k = 0$ for $k \ge p$ (since
for $k \ge p$ we have $B^k = 0$), we can confine ourselves to (9)
instead of (9) and (9a):

$$l_k = r_{k-1} - 2r_k + r_{k+1} \qquad (9)$$

here we can take any $k \ge 1$.
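
As a check of formula (9), it can be applied to the matrix (2a) of Subsection 3, whose block dimensions are 4, 3 and 1. The ranks below are read off directly from the powers of (2a); the computation is a worked sketch added for illustration.

```latex
\begin{align*}
r_0 &= 8, \quad r_1 = 3 + 2 + 0 = 5, \quad r_2 = 2 + 1 = 3, \quad r_3 = 1, \quad r_4 = r_5 = \dots = 0, \\
l_1 &= r_0 - 2r_1 + r_2 = 8 - 10 + 3 = 1, \\
l_2 &= r_1 - 2r_2 + r_3 = 5 - 6 + 1 = 0, \\
l_3 &= r_2 - 2r_3 + r_4 = 3 - 2 + 0 = 1, \\
l_4 &= r_3 - 2r_4 + r_5 = 1 - 0 + 0 = 1, \qquad l_k = 0 \ \text{for } k \ge 5 .
\end{align*}
```

The result is one submatrix each of dimensions 1, 3 and 4 and none of dimension 2, as is seen directly from (2a).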

9. Observe an obvious fact that will be used in the sequel. 

A transformation B is singular if and only if it has an eigen¬ 
value of zero. 

10. Theorem 3. For a linear transformation $B$ in $n$-dimensional
space $L$ to be nilpotent it is necessary and sufficient that its
characteristic polynomial be of the form $p(\lambda) = (-\lambda)^n$.

Proof. Necessity follows from Theorem 1, since in the case of a
nilpotent transformation $B$ the characteristic matrix $B - \lambda E$ in the
canonical basis has elements $(-\lambda)$ on the diagonal and zeros
below, and so

$$p(\lambda) = \det(B - \lambda E) = (-\lambda)^n$$

Sufficiency will not be proved here because it is a consequence of
the following more general theorem.

11. Let $B$ be a singular transformation and let

$$p(\lambda) = (-1)^n \lambda^{m_1} (\lambda - \lambda_2)^{m_2} \dots (\lambda - \lambda_j)^{m_j} \qquad (10)$$

be its characteristic polynomial. The roots $\lambda_2, \dots, \lambda_j$ (in general,
complex) are all distinct.

By the theorem of Section 8 we have

$$L = L_1 \oplus L_2 \qquad (11)$$

with $B$ nilpotent on $L_1$ and nonsingular on $L_2$ ($L_1$ and $L_2$ are
invariant under $B$).

Theorem 4. The dimension of $L_1$ is equal to the multiplicity $m_1$
of the zero root of the polynomial $p(\lambda)$. On the subspace $L_2$ the
transformation $B$ has characteristic polynomial
$p_2(\lambda) = (-1)^{n - m_1} (\lambda - \lambda_2)^{m_2} \dots (\lambda - \lambda_j)^{m_j}$.

Proof. By Theorem 3 of Section 7 we have, according to (11),

$$p(\lambda) = p_1(\lambda)\, p_2(\lambda) \qquad (12)$$

Let $n_1$ be the dimension of $L_1$. By Theorem 3 (i.e., by the portion
already proved),

$$p_1(\lambda) = (-1)^{n_1} \lambda^{n_1} \qquad (13)$$

Comparing (10), (12) and (13), we find: $n_1 \le m_1$. On the other
hand, if $n_1 < m_1$, then $p_2(\lambda)$ must have a zero root of multiplicity
$m_1 - n_1 > 0$. But this is impossible since the transformation $B$ is
nonsingular on $L_2$. Thus, $n_1 = m_1$, from which and also from (10),
(12) and (13) follows the second assertion of the theorem.

Remark 1. It is clear that the sufficiency in Theorem 3 is a
special case of Theorem 4 for $m_1 = n$.

Remark 2. Denote by $p$ the height of the transformation $B$ in $L_1$.
We have $p \le m_1$ since the height of the transformation does not
exceed the dimension of the space. On the other hand we know
that $L_1$ may be defined as the null space of the transformation $B^k$
for any $k \ge p$. Therefore, if the multiplicity $m_1$ of the zero
eigenvalue is known, then we can find $L_1$ as the null space of the
transformation $B^{m_1}$ without computing $p$.


§ 10. Reducing a transformation matrix 
to the Jordan normal form 


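1. Definition. We say that matrix $A$ has a Jordan normal form
if Jordan submatrices occupy the main diagonal with zeros
elsewhere:

$$A = \begin{Vmatrix}
G_{k_1}(\lambda_1) &                    &        & 0 \\
                   & G_{k_2}(\lambda_2) &        &   \\
                   &                    & \ddots &   \\
0                  &                    &        & G_{k_l}(\lambda_l)
\end{Vmatrix} \qquad (1)$$

The possibility is not precluded that in matrix (1) $k_i = k_j$ or
$\lambda_i = \lambda_j$ for certain $i$, $j$.

Recall that every Jordan submatrix is a $k_i \times k_i$ matrix of the
form

$$G_{k_i}(\lambda_i) = \begin{Vmatrix}
\lambda_i & 1         &        &           & 0 \\
          & \lambda_i & 1      &           &   \\
          &           & \ddots & \ddots    &   \\
          &           &        & \lambda_i & 1 \\
0         &           &        &           & \lambda_i
\end{Vmatrix}$$

A Jordan submatrix of order one consists of the single scalar $\lambda_i$:

$$G_1(\lambda_i) = \| \lambda_i \|, \qquad G_1(0) = \| 0 \|$$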

2. Theorem. In an $n$-dimensional complex space $L$, every linear
transformation $A$ has a basis relative to which the matrix of the
transformation has a Jordan normal form. When passing to
another analogous basis, the matrix $A$ is preserved up to a
permutation of the submatrices.

The basis mentioned in the theorem will be called canonical. This
term conforms with the terminology of Section 9: the case
considered in Section 9 is obtained when all $\lambda_i = 0$.

Remark. The proof of this theorem for nilpotent transformations
is given in Section 9. The preceding results permit reducing the
study of the general case to a consideration of nilpotent
transformations.

The proof is given below together with auxiliary propositions.

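3. Auxiliary propositions. Given a linear transformation $A$ in an
$n$-dimensional space $L$. Set

$$A - aE = B$$

where $a$ is a scalar.

Lemma 1. If $B$ is nilpotent in $L$ and hence has a canonical basis,
then, relative to this basis, $A$ has a Jordan normal form (1), where
all $\lambda_i = a$.

Proof. In a canonical basis, the transformation $B$ has the matrix

$$\begin{Vmatrix}
G_{k_1}(0) &            &        & 0 \\
           & G_{k_2}(0) &        &   \\
           &            & \ddots &   \\
0          &            &        & G_{k_l}(0)
\end{Vmatrix} \qquad (2)$$

This same basis $e_1, \dots, e_n$ is canonical for $A$ because of the
identity

$$G_{k_i}(a) = G_{k_i}(0) + aE \qquad (3)$$

which, written out in full, looks like this:

$$\begin{Vmatrix}
a & 1 &        &   & 0 \\
  & a & 1      &   &   \\
  &   & \ddots & \ddots &   \\
  &   &        & a & 1 \\
0 &   &        &   & a
\end{Vmatrix}
=
\begin{Vmatrix}
0 & 1 &        &   & 0 \\
  & 0 & 1      &   &   \\
  &   & \ddots & \ddots &   \\
  &   &        & 0 & 1 \\
0 &   &        &   & 0
\end{Vmatrix}
+ a
\begin{Vmatrix}
1 &   &        &   & 0 \\
  & 1 &        &   &   \\
  &   & \ddots &   &   \\
  &   &        & 1 &   \\
0 &   &        &   & 1
\end{Vmatrix}$$

Because of (2) and (3), the matrix of transformation $A$ relative to
the basis $e_1, \dots, e_n$ has the Jordan normal form

$$\begin{Vmatrix}
G_{k_1}(a) &        & 0 \\
           & \ddots &   \\
0          &        & G_{k_l}(a)
\end{Vmatrix}$$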
Lemma 2. Let $\lambda_1, \dots, \lambda_j$ be the roots of the characteristic
polynomial of the transformation $A$. Then the transformation
$B = A - aE$ has a characteristic polynomial whose roots are
$\lambda_1 - a, \dots, \lambda_j - a$, the multiplicity of the root $\lambda_i - a$ being equal
to that of the root $\lambda_i$.

Proof. Lemma 2 follows from the identity

$$\det(B - \lambda E) = \det(A - aE - \lambda E) = p(\lambda + a)$$

Lemma 3. A subspace $L'$ is invariant under the transformation
$B = A - \lambda E$ if and only if it is invariant under the
transformation $A$.

Proof. (1) Let $L'$ be invariant under $A$. This means that if
$x \in L'$, then $Ax \in L'$ and then

$$Bx = Ax - \lambda x \in L'$$

(2) If $L'$ is invariant under $B$, then it is also invariant under $A$
since $A = B - (-\lambda)E$.

4. Proof of the theorem. We now prove the existence of a
canonical basis for an arbitrary linear transformation $A$ specified in an
$n$-dimensional complex space $L$.

Let $\lambda_1, \dots, \lambda_j$ be distinct roots of the characteristic polynomial
$p(\lambda)$ of the transformation $A$ so that

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{m_1} \dots (\lambda - \lambda_j)^{m_j} \qquad (4)$$

where $m_i$ is the multiplicity of the root $\lambda_i$ ($i = 1, 2, \dots, j$),
$m_1 + m_2 + \dots + m_j = n$. Consider the singular transformation
$B_1 = A - \lambda_1 E$. Denote by $L_1$ the null space of the transformation

$$B_1^{m_1} = B_1 B_1 \dots B_1 = (A - \lambda_1 E)^{m_1}$$

and put $B_1^{m_1}(L) = \tilde{L}$. We know that

$$L = L_1 \oplus \tilde{L} \qquad (5)$$

with $B_1$ nilpotent in $L_1$. By Theorem 1 of Section 9, there is a basis
in $L_1$ canonical for $B_1$. The same basis is canonical also for $A$
considered in $L_1$ ($L_1$ and $\tilde{L}$ are invariant under $A$ by Lemma 3).
In accord with the expansion (5) we have

$$p(\lambda) = p_1(\lambda)\, \tilde{p}(\lambda)$$

where, by Theorem 4 of Section 9, $p_1(\lambda) = (-1)^{m_1} (\lambda - \lambda_1)^{m_1}$ and
$\tilde{p}(\lambda) = (-1)^{n - m_1} (\lambda - \lambda_2)^{m_2} \dots (\lambda - \lambda_j)^{m_j}$.

We now consider $A$ in the invariant subspace $\tilde{L}$ and, arguing as
before, obtain

$$\tilde{L} = L_2 \oplus \tilde{\tilde{L}} \qquad (6)$$

where $L_2$ is an invariant subspace of dimension $m_2$ in which there
exists a canonical basis for the transformation $B_2 = A - \lambda_2 E$ and
also for the transformation $A$. The subspace $L_2$ is defined as the
null space of the transformation

$$B_2^{m_2} = B_2 B_2 \dots B_2 = (A - \lambda_2 E)^{m_2} \qquad (7)$$

considered in $\tilde{L}$. However, if we consider the transformation (7)
throughout the space $L$, its null space should contain $L_2$ and
have the same dimension $m_2$. Therefore $L_2$ is the null space of (7)
considered throughout the space $L$. From (5) and (6) we get
$L = L_1 \oplus L_2 \oplus \tilde{\tilde{L}}$.

Continuing this process, we arrive after the $j$th step at the
expansion

$$L = L_1 \oplus L_2 \oplus \dots \oplus L_j \qquad (8)$$

where $L_i$ is the null space of the transformation $B_i^{m_i} = (A - \lambda_i E)^{m_i}$;
the dimension of $L_i$ is equal to $m_i$. The transformation $A$ has a
canonical basis in each of the $L_i$. The union of these bases yields
the desired basis of the entire space $L$.




Remark. The foregoing proof indicates a procedure for actually
reducing $A$ to the Jordan normal form and indicates a method for
finding a canonical basis. But it is also possible to write the Jordan
form of the transformation $A$ without constructing a canonical
basis. This possibility is a consequence of the following
subsection.

5. We here prove the uniqueness of the Jordan normal
form of the matrix of the given transformation. Let the canonical
basis $e_1, \dots, e_n$ be found. In it the matrix of the transformation $A$
has the form (1). Suppose that the Jordan submatrices corresponding
to $\lambda_1$ are located in the first $r$ rows of matrix (1). Then the
linear hull of the first $r$ basis vectors forms a subspace $L_1$ in
expansions of the form (5) and (8):

$$L_1 = L(e_1, \dots, e_r), \qquad \tilde{L} = L(e_{r+1}, \dots, e_n) = L_2 \oplus \dots \oplus L_j$$

The transformation $B_1 = A - \lambda_1 E$ is nilpotent in $L_1$. The basis
$e_1, \dots, e_r$ of subspace $L_1$ is canonical for $B_1$. The number and
length of the sequences of this basis (relative to $L_1$) are equal to
the number and dimensions of the Jordan submatrices corresponding
to the number $\lambda_1$ in the canonical matrix of the transformation
$A$. The number of sequences of different length in the
canonical basis of a nilpotent transformation is determined from
formula (9) of Section 9. As applied to this case, the quantities
$r_k$ are equal to the ranks of the transformations $B_1^k$ considered
in $L_1$, or, what is the same thing, to the dimensions of the
subspaces $B_1^k(L_1)$. We will show that in (9), Section 9, we can put in
place of $r_k$ the ranks of the transformations $B_1^k$ considered in the
entire space $L$. Let $R_k$ be the rank of $B_1^k$ in the space $L$. Then $R_k$ is
equal to the dimension of $B_1^k(L)$.

Since $B_1$ is nonsingular in subspace $\tilde{L}$, we have

$$\tilde{L} = B_1(\tilde{L}) = B_1^2(\tilde{L}) = \dots = B_1^k(\tilde{L})$$

By Subsection 6, Section 4, we find

$$B_1^k(L) = B_1^k(L_1 \oplus \tilde{L}) = B_1^k(L_1) \oplus B_1^k(\tilde{L}) = B_1^k(L_1) \oplus \tilde{L} \qquad (9)$$

Denote by $s$ the dimension of $\tilde{L}$. From (9) we get

$$R_k = r_k + s$$

so that (since $s$ is independent of $k$) we have

$$R_{k-1} - 2R_k + R_{k+1} = r_{k-1} - 2r_k + r_{k+1} \qquad (10)$$

(the right member of (10) enters into (9), Section 9).

Similar arguments apply to the other singular transformations
$B_i = A - \lambda_i E$.

Conclusion. Let $l_k^{(i)}$ be the number of Jordan submatrices of
dimension $k \ge 1$, corresponding to the eigenvalue $\lambda_i$, in the matrix
of the given transformation $A$ written in the canonical basis. Then

$$l_k^{(i)} = \operatorname{rank}(A - \lambda_i E)^{k-1} - 2 \operatorname{rank}(A - \lambda_i E)^k + \operatorname{rank}(A - \lambda_i E)^{k+1} \qquad (11)$$

All the terms in the right member of (11) are independent of the
choice of basis. The proof of the theorem of Subsection 2 is
complete.
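
Formula (11) lends itself to direct computation. The following Python sketch is an illustration only: the function names and the sample matrix are assumptions made for the example, and the eigenvalue is supposed to be known in advance. It counts the Jordan submatrices of each dimension for a given eigenvalue using exact rational arithmetic.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(M):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def jordan_block_counts(A, lam):
    """Return {k: number of Jordan submatrices of dimension k} for eigenvalue lam."""
    n = len(A)
    B = [[Fraction(A[i][j]) - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    ranks = [n]  # rank of (A - lam E)^0 = E
    for _ in range(n + 1):
        power = mat_mul(power, B)
        ranks.append(rank(power))
    # formula (11): rank of power k-1, minus twice rank of power k, plus rank of power k+1
    return {k: ranks[k - 1] - 2 * ranks[k] + ranks[k + 1] for k in range(1, n + 1)}


# Hypothetical sample matrix whose only eigenvalue is 2
A = [[3, 1, 0],
     [-1, 1, 0],
     [0, 0, 2]]
print(jordan_block_counts(A, 2))
```

For this sample matrix the ranks of successive powers of $A - 2E$ are 3, 1, 0, 0, so the sketch finds one submatrix of dimension 1 and one of dimension 2.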

6. Let a canonical basis be found. Then each of the subspaces $L_i$
of (8) is represented in the form of a direct sum of invariant
subspaces, in each of which the transformation is specified by a single
Jordan submatrix. Polynomials of the form $(\lambda - \lambda_i)^k$, which, up
to sign, are equal to the characteristic polynomials of these Jordan
submatrices, are called elementary divisors of the matrix of the
transformation $A$; $k$ is the order of the Jordan submatrix.

The number of elementary divisors of a given power $k$ with a
given eigenvalue $\lambda_i$ is equal to the number $l_k^{(i)}$ (see (11)). In matrix
theory it is demonstrated that elementary divisors can be computed
if one knows the greatest common divisors of order-$s$ minors of
matrix $A - \lambda E$ for $s = 1, \dots, n$. We thus have another method for
finding the Jordan normal form of matrix $A$.

7. The theorem of Subsection 2 is true in real space on the
assumption that all roots of $p(\lambda)$ are real. The proof is that given
in Subsections 3 to 5.

8. If in a real space $L$ a transformation $A$ is given in which
certain roots of the characteristic polynomial are complex, then in
place of (4) we have

$$p(\lambda) = (-1)^{n-m} (\lambda - \lambda_1)^{m_1} \dots (\lambda - \lambda_h)^{m_h}\, \tilde{p}(\lambda) \qquad (12)$$

where $\tilde{p}(\lambda)$ is a polynomial of degree $m$ devoid of real roots,
$m_1 + \dots + m_h + m = n$. By (12) we get an expansion of the
space $L$ into invariant subspaces: $L = L_1 \oplus \dots \oplus L_h \oplus \tilde{L}$, where
$L_i$ is a subspace of dimension $m_i$ (the null space of the
transformation $(A - \lambda_i E)^{m_i}$), in which $A$ has one eigenvalue $\lambda_i$, and $\tilde{L}$ is a
subspace of dimension $m$, in which $A$ is nonsingular and does not
have a single eigenvector. A canonical basis can be chosen in each
one of the $L_i$. In $\tilde{L}$ we choose a basis at pleasure. Then the
matrix $A$ becomes

$$\begin{Vmatrix}
G^{(1)} &        &         & 0 \\
        & \ddots &         &   \\
        &        & G^{(h)} &   \\
0       &        &         & \tilde{A}
\end{Vmatrix}$$

where $G^{(1)}, \dots, G^{(h)}$ are Jordan normal forms of the matrices of the
transformation $A$ in the subspaces $L_1, \dots, L_h$ and $\tilde{A}$ is a
nonsingular $m$ by $m$ matrix. The question of further simplifying the
matrix $A$ (via a special choice of basis in $\tilde{L}$ that simplifies the
submatrix $\tilde{A}$) will not be considered.

§ 11. Transformations of a simple structure

1. Definition. A linear transformation $A$ in space $L$ is called a
transformation of a simple structure if there is a basis in $L$
consisting of the eigenvectors of that transformation.

In the case of a transformation of a simple structure, the Jordan
normal form of the matrix consists of one-dimensional Jordan
submatrices. Actually we have already dealt with transformations of
a simple structure in Subsection 5 of Section 7.

We now give two criteria for the existence of a basis of
eigenvectors.

2. First criterion (sufficient). If the characteristic polynomial of
a linear transformation $A$ of a complex space $L$ does not have any
multiple roots, then there is a basis in $L$ made up of eigenvectors
of $A$.

Proof. Under the conditions of the criterion, the expansion (8)
of Section 10 consists of $n$ distinct one-dimensional invariant
subspaces $L_1, \dots, L_n$. Here, each $L_i$ is a linear hull of the
eigenvector $e_i$. By Section 14, Chapter I, the vectors $e_1, \dots, e_n$ form a
basis in $L$.

3. Second criterion (necessary and sufficient). In a complex
space $L$ there exists a basis of eigenvectors of the transformation $A$
if and only if for each root $\lambda_i$ of the characteristic polynomial $p(\lambda)$
the rank of the matrix $A - \lambda_i E$ is equal to the difference $n - m_i$,
where $m_i$ is the multiplicity of this root and $n$ is the dimension
of $L$.

Proof. (1) Necessity. Let there be a basis of eigenvectors.
Relative to this basis, the matrix of the transformation $A$ is diagonal
(see Section 7, formula (3)) and the characteristic matrix is of the
form

$$A - \lambda E = \begin{Vmatrix}
\lambda_1 - \lambda &                     &        & 0 \\
                    & \lambda_2 - \lambda &        &   \\
                    &                     & \ddots &   \\
0                   &                     &        & \lambda_n - \lambda
\end{Vmatrix} \qquad (1)$$

so that

$$p(\lambda) = \det(A - \lambda E) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \dots (\lambda_n - \lambda)$$

If, for example, $\lambda_1$ is of multiplicity $m_1$, that is,

$$\lambda_1 = \lambda_2 = \dots = \lambda_{m_1}, \qquad \lambda_{m_1 + 1} \ne \lambda_1, \ \dots, \ \lambda_n \ne \lambda_1$$

then for $\lambda = \lambda_1$ the first $m_1$ elements are equal to zero on the
diagonal of matrix (1) while the remaining elements are nonzero,
and so

$$\operatorname{rank}(A - \lambda_1 E) = n - m_1 \qquad (2)$$

Because of the invariance of $p(\lambda)$ and of the rank of the
characteristic matrix, (2) is independent of any choice of basis.

(2) Sufficiency. By Subsection 4, Section 10, the dimension of
the subspace $L_i$ is equal to the multiplicity $m_i$ of the root $\lambda_i$. If

$$\operatorname{rank}(A - \lambda_i E) = n - m_i \qquad (3)$$

then to the eigenvalue $\lambda_i$ correspond $m_i$ linearly independent
eigenvectors (see Section 6, Subsection 2). They all lie in the subspace
$L_i$ and form a basis there. If (3) holds for every $i$, then the union
of such bases for all $i = 1, \dots, j$ yields a basis of the space $L$
(see Section 14, Chapter I), which basis consists of eigenvectors.
This completes the proof of the second criterion.
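
The second criterion translates into a simple rank test. The Python sketch below is only an illustration under the assumption that the distinct characteristic roots and their multiplicities are already known (they are passed in as a dictionary); the function names and sample matrices are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def has_eigenbasis(A, roots):
    """Second criterion: rank(A - lam E) == n - m for every root lam of multiplicity m.

    `roots` maps each distinct characteristic root to its multiplicity
    (assumed to be known in advance).
    """
    n = len(A)
    for lam, m in roots.items():
        shifted = [[Fraction(A[i][j]) - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        if rank(shifted) != n - m:
            return False
    return True


# Simple roots 3 and -1: a basis of eigenvectors exists
print(has_eigenbasis([[1, 2], [2, 1]], {3: 1, -1: 1}))
# Double root 2 with rank(A - 2E) = 1, not 0: no basis of eigenvectors
print(has_eigenbasis([[2, 1], [0, 2]], {2: 2}))
```

The first call prints `True` and the second prints `False`.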

Remark. Under the conditions of the second criterion, the
transformation $A$ acts in each of the $L_i$ like a similarity transformation
with coefficient $\lambda_i$ (in this connection, see Section 7, Subsection 4).

4. Both criteria hold true in real space under the supplementary 
condition that all roots of the characteristic polynomial be real. 

The proof is left to the reader. 

5. From the results of Section 10 it follows that an arbitrary
linear transformation $A$ in a complex linear space $L$ (and also in
real space, provided that $p(\lambda)$ has only real roots) may be given
in the form of the sum

$$A = B + C$$

where $B$ is a nilpotent transformation and $C$ is a transformation
of a simple structure (see (1) and (2) in Section 10).

§ 12. Equivalence of matrices

1. Definition. Two $n \times n$ matrices $A$ and $B$ are said to be
equivalent if there exists a nonsingular $n \times n$ matrix $Q$ such that
$B = QAQ^{-1}$ (the matrices $A$, $B$, and $Q$ are either all real or all
complex).

The geometric significance of this definition consists in the
following: if $A$ is regarded as the matrix of some linear transformation
relative to an arbitrarily chosen basis $e_1, \dots, e_n$, then $B$
specifies the very same transformation relative to another basis
$e_{1'}, \dots, e_{n'}$ with $Q = (P^*)^{-1}$, where $P$ is the matrix of the right
members of the formulas

$$e_{i'} = \sum_i P^i_{i'} e_i$$

(see (1a) in Section 2).

It is readily verified that if $A$ is equivalent to $B$, then $B$ is
equivalent to $A$, and that two matrices which are separately equivalent
to a third are equivalent. Thus, the entire collection of matrices
(real or complex) breaks down into disjoint classes of equivalent
matrices.

2. Let $H$ be a subgroup of matrices. If in defining equivalence
we take only matrices $Q$ in $H$, we get matrices that are equivalent
relative to the given subgroup $H$. Distinct matrices equivalent to
a given one relative to $H$ express one and the same linear
transformation with respect to distinct bases belonging to one class of
bases relative to the subgroup $H$.

3. From the results of Section 10, it follows that for every
complex $n \times n$ matrix $A$ there exists an equivalent matrix $G$ having a
Jordan normal form. A permutation of the submatrices in $G$
carries $G$ into an equivalent matrix $G'$, since a transition from $G$
to $G'$ is associated, geometrically, with a permutation of certain
sets of vectors relative to a given basis. The process of finding
a matrix $G$ equivalent to $A$ is called reducing $A$ to the Jordan
normal form.

Two matrices, the Jordan normal forms of which differ in
eigenvalues, number or dimensions of the Jordan submatrices, are not
equivalent.

4. If for a given matrix $A$ we know the Jordan normal form $G$,
then $Q$ in the equation

$$G = QAQ^{-1} \qquad (1)$$

can be found in the following manner. Postmultiply both sides of
(1) by $Q$ and transpose all terms to one side to get

$$GQ - QA = 0 \qquad (2)$$

which may be viewed as a homogeneous system of linear equations
in which the unknowns are elements of $Q$. Any solution of such
a system that satisfies the supplementary condition

$$\det Q \ne 0 \qquad (3)$$

yields the desired matrix $Q$. This method is very cumbersome for
large $n$ since (2) contains $n^2$ equations.

Other methods have been elaborated for finding the matrices $G$
and $Q$ from a given matrix $A$. One of them was actually described
in Sections 9 and 10 (also, see, for example [8]).


5. Example. Reduce to Jordan normal form the matrix

$$A = \begin{Vmatrix} 4 & 1 \\ -1 & 6 \end{Vmatrix}$$

Solution. Form the characteristic polynomial:

$$p(\lambda) = \begin{vmatrix} 4 - \lambda & 1 \\ -1 & 6 - \lambda \end{vmatrix} = \lambda^2 - 10\lambda + 25$$

whence $\lambda_1 = \lambda_2 = 5$. We then find

$$A - \lambda_1 E = \begin{Vmatrix} -1 & 1 \\ -1 & 1 \end{Vmatrix}, \qquad \operatorname{rank}(A - \lambda_1 E) = 1$$

The sum of the multiplicity of the root $\lambda_1$ and of the rank of
$(A - \lambda_1 E)$ exceeds $n = 2$ and so there is no basis of eigenvectors
for a transformation with matrix $A$. In such a situation ($n = 2$,
$\lambda_1$ is a multiple root, there is no basis of eigenvectors), there is only
one possibility for a Jordan normal form: a two-dimensional Jordan
submatrix corresponding to the given root $\lambda_1 = 5$:

$$G = \begin{Vmatrix} 5 & 1 \\ 0 & 5 \end{Vmatrix} \qquad (4)$$

We can reason differently. It is easy to compute that

$$(A - \lambda_1 E)^2 = 0$$

and so the ranks of successive nonnegative powers of the matrix
$(A - \lambda_1 E)$ form the sequence

$$r_0 = 2, \quad r_1 = 1, \quad r_2 = r_3 = \dots = 0$$

Putting $r_k$ in (11), Section 10, we find that the number of
one-dimensional Jordan submatrices in matrix $G$ is zero, the number
of two-dimensional ones equals unity, in agreement with (4).

Now let us find the matrix $Q$. Substituting

$$Q = \begin{Vmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{Vmatrix}$$

into (2), we get the system of equations

$$\left. \begin{aligned}
Q_{11} + Q_{12} + Q_{21} &= 0 \\
-Q_{11} - Q_{12} + Q_{22} &= 0 \\
Q_{21} + Q_{22} &= 0 \\
-Q_{21} - Q_{22} &= 0
\end{aligned} \right\} \qquad (5)$$

(All indices are written as subscripts since the tensorial nature of
the formulas is immaterial.) The last two equations of (5) are a
consequence of the first two, from which we find

$$Q_{11} = a, \quad Q_{12} = b, \quad Q_{21} = -a - b, \quad Q_{22} = a + b$$

where $a$, $b$ are arbitrary scalars. We have to ensure the
condition (3):

$$\det Q = \begin{vmatrix} a & b \\ -a - b & a + b \end{vmatrix} = (a + b)^2 \ne 0$$

whence $a \ne -b$. No other restrictions are imposed on $a$ and $b$.
For instance, taking $a = 1$ and $b = 0$, we get

$$Q = \begin{Vmatrix} 1 & 0 \\ -1 & 1 \end{Vmatrix}$$

It is easy to verify that (1) holds true.
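
The verification can also be done mechanically. The short Python sketch below is added for illustration; the helper names are assumptions. It checks equation (2), $GQ = QA$, and the value of $\det Q$ for the whole family of solutions over a small grid of integer parameters $a$, $b$.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


G = [[5, 1], [0, 5]]   # Jordan normal form (4)
A = [[4, 1], [-1, 6]]  # the given matrix


def q_matrix(a, b):
    """General solution of system (5) with parameters a and b."""
    return [[a, b], [-a - b, a + b]]


def det2(M):
    """Determinant of a 2 by 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def satisfies_system(a, b):
    """True if Q(a, b) satisfies GQ - QA = 0."""
    Q = q_matrix(a, b)
    return mat_mul(G, Q) == mat_mul(Q, A)


grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
print(all(satisfies_system(a, b) for a, b in grid))
print(all(det2(q_matrix(a, b)) == (a + b) ** 2 for a, b in grid))
```

Both checks print `True`. Any pair with $a \ne -b$, such as $a = 1$, $b = 0$, therefore gives an admissible matrix $Q$.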


§ 13. The KamiIton-Cayley formula 

1 . A direct consequence of Section 10 is an identity known as the 
Hamilton-Cayley formula. 

Let 


/>(A,)==(—1)"(a,"- f '+ ... +p„_,A, + p„) 4 

be the characteristic polynomial of the linear transformation A. 
Then p(A) is the zero linear transformation. Written out, 

A'‘ + p,A''-'4- ... +p„-,A + P„£ = 0 (1) 

2. Proof. Here the space will be taken to be complex. By Subsection 4, Section 10, we have

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{m_1} \dots (\lambda - \lambda_j)^{m_j} \tag{2}$$

Accordingly,

$$p(A) = (-1)^n (A - \lambda_1 E)^{m_1} \dots (A - \lambda_j E)^{m_j} \tag{3}$$

which is easy to see if we simultaneously multiply out the parentheses in the right members of (2) and (3). Using the notation of Subsection 4, Section 10, we can write, in place of (3),

$$p(A) = (-1)^n B_1^{m_1} \dots B_j^{m_j} \tag{4}$$

The order of the factors in (3) and (4) is immaterial because here we have only products of the operators $A$ and $E$, which commute.

Let $x$ be an arbitrary vector in $L$. Since $L$ is the sum of the $L_i$, we can write the expansion

$$x = x_1 + \dots + x_j, \quad \text{where } x_i \in L_i, \; i = 1, \dots, j \tag{5}$$

On the other hand, by the definition of $B_i$ and $L_i$ we have

$$B_i^{m_i}(L_i) = 0 \tag{6}$$

It is evident now that

$$p(A)x = 0$$

due to (4)-(6), since in (4) the factors can be written in any order.
Formula (1) is thus proved. 
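
The identity (1) can be checked numerically on a small example. The sketch below is an illustration only and is not part of the proof: it computes the coefficients $p_1, \dots, p_n$ by the Faddeev-LeVerrier recurrence (a technique not used in the text, chosen here because it needs only matrix products and traces) and then evaluates $A^n + p_1 A^{n-1} + \dots + p_n E$ exactly in rational arithmetic. The function names and the sample matrix are assumptions made for the illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Return [1, p_1, ..., p_n], the coefficients of det(lambda*E - A)."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for step in range(1, n + 1):
        AM = mat_mul(A, M)
        # M <- A*M + (latest coefficient) * E
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AM_next = mat_mul(A, M)
        trace = sum(AM_next[i][i] for i in range(n))
        coeffs.append(-trace / step)
    return coeffs

def poly_at_matrix(coeffs, A):
    """Evaluate A^n + p_1 A^(n-1) + ... + p_n E by Horner's scheme."""
    n = len(A)
    R = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        R = mat_mul(R, A)
        R = [[R[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return R

A = [[2, 1, 0], [0, 2, 1], [1, 0, 3]]   # arbitrary sample matrix
coeffs = char_poly(A)
residual = poly_at_matrix(coeffs, A)
print(coeffs)
print(residual)
```

The monic polynomial computed here differs from $p(\lambda)$ of Subsection 1 only by the factor $(-1)^n$, which does not affect the identity.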

The Hamilton-Cayley formula holds not only in complex space but in real space as well, since a real space can always be extended to form a complex space. More explicitly, if we have a given basis, we can permit a consideration of vectors with complex components (coordinates); then a linear transformation $A$ naturally extends to the resulting complex space (its matrix must be left unchanged).

§ 1. Scalar products

1. Let $L$ be a real linear space. In $L$ we introduce a new operation called the scalar multiplication of vectors.

Scalar multiplication assigns to each pair of vectors $x$, $y$ of $L$ a real number denoted by $(x, y)$ and called the scalar product of vector $x$ by vector $y$.

By analogy with elementary analytic geometry we require that the following properties hold true:

(1) commutativity, $(x, y) = (y, x)$;

(2) distributivity, $(x_1 + x_2, y) = (x_1, y) + (x_2, y)$;

(3) homogeneity, $(\alpha x, y) = \alpha (x, y)$ for every real number (scalar) $\alpha$;

(4) nonsingularity, if $(x, y) = 0$ for a fixed $x$ and any $y$ in $L$, then $x = 0$.

Here $x$, $y$, $x_1$, $x_2$ are always arbitrary vectors of space $L$.

2. Notice that in elementary analytic geometry these properties 
of a scalar product are proved as theorems, while we regard them 
as axioms and include them in the definition of a scalar product. 

3. The second and third properties together signify linearity of the scalar product in the first argument. Because of commutativity, we have linearity in the second argument as well.

Thus, a scalar product $(x, y)$ is a bilinear form which is symmetric by the first property and nonsingular by the fourth property. Indeed, the fourth property means that the zero subspace of the bilinear form $(x, y)$ is zero-dimensional, whence follows its nonsingularity (see Section 11 of Chapter IV).

4. Clearly, the converse is true as well.

Every nonsingular symmetric bilinear form $g(x, y)$ specified in a space $L$ may be taken for a scalar product by putting

$$(x, y) = g(x, y)$$

for any $x, y \in L$.

Remark. Naturally, a scalar product depends on the choice of the form $g(x, y)$. If we choose different forms for the scalar product, then for a given pair of vectors $x$, $y$ in $L$ the scalar product will in general receive different numerical values.

5. Let a scalar product $(x, y) = g(x, y)$ be introduced in a space $L$.

Assuming the space to be $n$-dimensional, we take an arbitrary basis $e_1, \dots, e_n$. If $x = \sum_i x^i e_i$, $y = \sum_k y^k e_k$, then the scalar product will be written in terms of components (coordinates) as follows:

$$(x, y) = g(x, y) = \sum_{i,k} g_{ik} x^i y^k \tag{1}$$

where $g_{ik}$ are the coefficients of the bilinear form $g(x, y)$ relative to the given basis $e_1, \dots, e_n$. They are the values of the form on the basis vectors. Thus

$$(e_i, e_k) = g_{ik} \tag{2}$$

and $g_{ik} = g_{ki}$. The equations (2) constitute a multiplication table of the basis vectors.

If the right members of (2) are given, then the scalar product 
of any pair of vectors x, y is uniquely determined (according 
to (1)). 

6. Definition 1. The vectors $x$, $y$ are called orthogonal if $(x, y) = 0$.

In terms of components, the orthogonality condition of the vectors $x$, $y$ is of the form

$$\sum_{i,k} g_{ik} x^i y^k = 0$$

Definition 2. Vector $x$ is orthogonal to a subspace $L'$ if $(x, y) = 0$ for any $y \in L'$.

Note that if $L'$ has dimension $k$, then for the orthogonality of $x$ to the subspace $L'$ it is sufficient that $x$ be orthogonal to any $k$ independent vectors lying in $L'$. Indeed, if the independent vectors $a_1, \dots, a_k$ lie in $L'$ and if $(x, a_1) = 0, \dots, (x, a_k) = 0$, then for any $y \in L'$ we have $y = \lambda^1 a_1 + \dots + \lambda^k a_k$, whence

$$(x, y) = \lambda^1 (x, a_1) + \dots + \lambda^k (x, a_k) = 0$$

Definition 3. Subspaces $L'$, $L''$ are said to be orthogonal if $(x, y) = 0$ for any $x \in L'$ and any $y \in L''$.

Definition 4. Subspace $L''$ is called the orthogonal complement of subspace $L'$ in $L$ if $L'$ and $L''$ are orthogonal and their direct sum coincides with $L$.

Remark. It is to be stressed that the orthogonality of vectors and the orthogonality of subspaces depend essentially on precisely which bilinear form $g(x, y)$ is taken as the scalar product $(x, y)$
in space L. 
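
The dependence of the scalar product, and hence of orthogonality, on the chosen form can be seen in a small computation. The sketch below evaluates formula (1) of this section for two different symmetric nonsingular coefficient matrices; both matrices and the function name `scalar_product` are illustrative assumptions, not taken from the text.

```python
def scalar_product(g, x, y):
    """(x, y) = sum over i, k of g[i][k] * x[i] * y[k], as in formula (1)."""
    n = len(g)
    return sum(g[i][k] * x[i] * y[k] for i in range(n) for k in range(n))

g_first = [[1, 0], [0, 1]]     # one admissible form (illustrative)
g_second = [[2, 1], [1, 1]]    # another symmetric form with determinant 1

x = [1, 0]
y = [0, 1]
z = [1, -2]

# The same pair of vectors receives different values under different forms.
print(scalar_product(g_first, x, y), scalar_product(g_second, x, y))
print(scalar_product(g_first, x, z), scalar_product(g_second, x, z))
```

Here $x$ and $y$ are orthogonal with respect to the first form only, while $x$ and $z$ are orthogonal with respect to the second form only.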

§ 2. The norm of a vector 

1. Let a scalar product be given in a linear space $L$.

Definition. The norm of a vector $x$ is the number

$$\|x\| = +\sqrt{(x, x)} \tag{1}$$

The norm is a generalization of the concept of the modulus (absolute value) or length of a vector known from elementary geometry.

The scalar product $(x, x)$ is a real number but it may not be positive, so that the norm of a vector may prove to be imaginary. We make the convention that the radical in (1) can be either a nonnegative real number or an imaginary number having a positive multiplier with $i$ ($i = +\sqrt{-1}$).

2. From the definition of a norm it follows that

$$\|\alpha x\| = |\alpha| \cdot \|x\|$$

for any $x \in L$ and any scalar $\alpha$.

In particular

$$\|{-x}\| = \|x\|, \quad \|0\| = 0 \tag{2}$$

Nonzero vectors whose norm is equal to zero are called isotropic. Isotropic vectors exist if and only if the quadratic form $(x, x)$ is
not of fixed sign. 
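
As a hedged illustration of the convention for the radical in (1), the sketch below uses the two-dimensional form $(x, x) = (x^1)^2 - (x^2)^2$, which is not of fixed sign. The choice of form and the function names are assumptions made for the example; the principal complex square root from the standard library happens to agree with the convention (nonnegative real, or a positive multiple of $i$) for real arguments.

```python
import cmath

def scalar_square(g, x):
    """(x, x) = sum of g[i][k] * x[i] * x[k]: the quadratic form in components."""
    n = len(g)
    return sum(g[i][k] * x[i] * x[k] for i in range(n) for k in range(n))

def norm(g, x):
    """Norm by definition (1), taken as the principal square root."""
    return cmath.sqrt(scalar_square(g, x))

g = [[1, 0], [0, -1]]           # illustrative form of variable sign

unit = norm(g, [1, 0])          # real norm
imaginary = norm(g, [0, 2])     # purely imaginary norm
isotropic = norm(g, [1, 1])     # nonzero vector with zero norm
print(unit, imaginary, isotropic)
```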

3. The quadratic form $\|x\|^2 = (x, x)$ is called the metric form of the space under consideration.

It is determined by the bilinear form $(x, y)$ and in turn defines it as its polar form. Thus, specification of a scalar product and specification of a quadratic form are equivalent as far as measuring the norm of vectors goes. For this reason, spaces with a given scalar product are also called spaces with a quadratic metric, or inner-product spaces.

If the space is $n$-dimensional, then the metric form expressed in terms of components is

$$\|x\|^2 = (x, x) = \sum_{i,k} g_{ik} x^i x^k$$

4. Theorem. If the metric form is positive definite, then for any two vectors $x, y \in L$ we have the inequality

$$\|x + y\| \le \|x\| + \|y\| \tag{3}$$

Proof. Take advantage of the Cauchy-Bunyakovsky inequality (see Section 10, Chapter IV)

$$(x, y)^2 \le (x, x) \cdot (y, y) \tag{4}$$

Taking into account (4), we get

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + 2(x, y) + (y, y) \le (x, x) + 2\sqrt{(x, x) \cdot (y, y)} + (y, y) = (\|x\| + \|y\|)^2$$
whence follows (3). 

Remark. From (3) it follows that if the metric form is positive definite, then

$$\|x - y\| \ge \|x\| - \|y\|, \quad \|x + y\| \ge \|x\| - \|y\|$$

5. Let us consider an affine space $\mathfrak{A}$ to which corresponds a linear space $L$ with quadratic metric.

For each pair of points $A$, $B$ in $\mathfrak{A}$ we define the distance $\rho(A, B)$ and assume it to be equal to the norm of the vector $\overrightarrow{AB}$:

$$\rho(A, B) = \|\overrightarrow{AB}\| \tag{5}$$

We have

$$\rho(A, B) = \rho(B, A), \quad \rho(A, A) = 0 \tag{6}$$

Formulas (6) follow from (2) and (5).

6. In the case of a positive definite metric form $(x, x)$, the distance between points is zero if and only if the points are coincident and, besides, for any three points $A$, $B$, $C$ in $\mathfrak{A}$ the triangle inequality holds:

$$\rho(A, C) \le \rho(A, B) + \rho(B, C) \tag{7}$$

Inequality (7) follows from inequality (3) and formula (5).

7. If the distance between points of the affine space $\mathfrak{A}$ is defined by formula (5), then we say that a quadratic metric is specified in the affine space $\mathfrak{A}$. Expressed in terms of affine coordinates, the square of the distance is

$$\rho^2(A, B) = \sum_{i,k} g_{ik} (x_B^i - x_A^i)(x_B^k - x_A^k) \tag{8}$$

where $x_A^1, \dots, x_A^n$ are the affine coordinates of the point $A$ and $x_B^1, \dots, x_B^n$ are the affine coordinates of the point $B$.

The right member of (8), which is quadratic with respect to the differences of the coordinates of the arbitrary points $A$ and $B$, is called the metric form of the space $\mathfrak{A}$.

§ 3. Orthonormal bases

1. The bases in a quadratic-metric space are not of the same status. They include some that are most convenient from the viewpoint of the given metric.

For instance, the basis $e_1, \dots, e_n$ can be chosen so that the metric form $g(x, x)$ is normal relative to this basis:

$$\|x\|^2 = g(x, x) = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

Then the scalar product of two vectors can be represented thus:

$$(x, y) = x^1 y^1 + \dots + x^k y^k - x^{k+1} y^{k+1} - \dots - x^n y^n$$

It is clear that the scalar products $(e_i, e_j) = 0$ if $i \ne j$, that is, for $i \ne j$ the basis vectors are orthogonal. Here, $\|e_i\|^2 = 1$ if $i = 1, \dots, k$; $\|e_i\|^2 = -1$ if $i = k + 1, \dots, n$. Thus the vectors of the basis are normalized so that the squares of their norms are in absolute value equal to unity. The vectors $e_i$ are called unit vectors if $i \le k$, and imaginary-unit vectors if $i \ge k + 1$. Generally, a vector $a$ is called a unit vector if $\|a\|^2 = 1$ and an imaginary-unit vector if $\|a\|^2 = -1$.

Definition. A basis $e_1, \dots, e_n$ that satisfies the conditions enumerated in this subsection is said to be orthonormal.

Theorem 1. In an $n$-dimensional linear space with a given quadratic metric, any selection of $n$ pairwise orthogonal unit or imaginary-unit vectors is a basis in which the metric form is normal.

Proof. Let $e_1, \dots, e_n$ be such a selection of vectors. Let us make sure that they are linearly independent. We consider the relation

$$\lambda_1 e_1 + \lambda_2 e_2 + \dots + \lambda_n e_n = 0$$

whence, forming the scalar product with $e_1$, we get

$$\lambda_1 (e_1, e_1) + \lambda_2 (e_2, e_1) + \dots + \lambda_n (e_n, e_1) = (0, e_1)$$

But by hypothesis, $(e_1, e_1) = \pm 1$, $(e_j, e_1) = 0$ ($j \ne 1$); besides, $(0, e_1) = 0$. Hence $\lambda_1 = 0$. Similarly we prove that $\lambda_2 = 0, \dots, \lambda_n = 0$. We thus establish that the vectors $e_1, \dots, e_n$ are independent and, hence, do actually constitute a basis.

Since $g(e_i, e_i) = (e_i, e_i) = \pm 1$ and $g(e_i, e_j) = (e_i, e_j) = 0$ for $i \ne j$, the form $g(x, x)$ is normal relative to the basis $e_1, \dots, e_n$.

2. Along with Theorem 1, we note the following assertion.

It is always possible, in $n$-dimensional linear space, to specify (in unique fashion) a quadratic metric such that an arbitrary preassigned basis $e_1, \dots, e_k, e_{k+1}, \dots, e_n$ will be orthonormal, the vectors $e_1, \dots, e_k$ will be unit vectors, and the vectors $e_{k+1}, \dots, e_n$ will be imaginary-unit vectors. Here, $k$ is also any preassigned integer from 0 to $n$.

Proof. The desired metric is uniquely determined by specifying the metric form $g(x, x)$, which, relative to the basis $e_1, \dots, e_n$, is

$$g(x, x) = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

3. By the law of inertia of quadratic forms, the number of unit and the number of imaginary-unit vectors is independent of the choice of basis that is orthonormal in the given quadratic metric.

Definition. The number $k$ of unit vectors of an orthonormal basis is called the positive index of the space with given quadratic metric.

If $k = n$ or if $k = 0$, the space is called Euclidean.

If $1 \le k \le n - 1$, the space is called quasi-Euclidean.

Of particular importance is the quasi-Euclidean space when $k = n - 1$. It is called Minkowski space and for $n = 4$ plays an important role in relativity theory.
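
For a metric form given by its coefficient matrix $g_{ik}$ in an arbitrary basis, the positive index can be found by reducing the form to a sum of squares. The sketch below is one possible way to do this and rests on a stated assumption: all leading principal minors of the matrix are nonzero, so that elimination without row exchanges works (this corresponds to the Jacobi method). The function name `positive_index` and the sample matrices are illustrative.

```python
from fractions import Fraction

def positive_index(g):
    """Count the positive squares of the form with symmetric matrix g.

    Assumes every leading principal minor of g is nonzero; the pivots of
    elimination without row exchanges are then the ratios of consecutive
    leading minors, and their signs give the signs of the squares.
    """
    n = len(g)
    a = [[Fraction(v) for v in row] for row in g]
    k = 0
    for p in range(n):
        pivot = a[p][p]
        if pivot == 0:
            raise ValueError("zero leading minor: reorder the basis and retry")
        if pivot > 0:
            k += 1
        for i in range(p + 1, n):
            factor = a[i][p] / pivot
            for j in range(p, n):
                a[i][j] -= factor * a[p][j]
    return k

minkowski = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
g = [[1, 2], [2, 1]]
g_other_basis = [[6, 3], [3, 1]]   # the same form after a change of basis

print(positive_index(minkowski), positive_index(g), positive_index(g_other_basis))
```

In agreement with the law of inertia, the matrices `g` and `g_other_basis`, which represent one form in two bases, yield the same index.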

§ 4. Orthogonal projection. Orthogonalization 

1. In this section we consider a Euclidean space $L$, that is, a linear space with a metric form of fixed sign. We consider the metric form to be positive definite. (The case of a negative definite metric form does not require separate consideration. This will be clear from Section 5.) The space $L$ may be infinite-dimensional.

Let a subspace $L'$ be given in $L$. Suppose that a vector $x \in L$ is given as the sum

$$x = x' + \bar{x} \tag{1}$$

where $x' \in L'$ and $\bar{x}$ is orthogonal to $L'$. Then the vector $x'$ is said to be the orthogonal projection of the vector $x$ on the subspace $L'$. The orthogonal projection of $x$ on $L'$ is unique. Indeed, suppose there is another expansion $x = x'_1 + \bar{x}_1$, where $x'_1 \in L'$ and $\bar{x}_1$ is orthogonal to $L'$. Then $x' - x'_1 = \bar{x}_1 - \bar{x}$, whence

$$(x' - x'_1)^2 = (x' - x'_1, x' - x'_1) = (\bar{x}_1 - \bar{x}, x' - x'_1) = 0 \tag{a}$$

since $x' - x'_1 \in L'$ and $\bar{x}$ and $\bar{x}_1$ are orthogonal to $L'$. From (a) it follows that $x' - x'_1 = 0$, or $x' = x'_1$, since the metric form of the space is positive definite.

The special case where $L$ is three-dimensional and $L'$ is two-dimensional is shown in Fig. 36.

The transformation of space $L$ which assigns to each vector $x$ a corresponding vector $x'$ by formula (1) is also called an orthogonal projection on $L'$.

If $L$ is viewed as a point space and $L'$ as a plane in that space, then the point $M'$ with radius vector $\overrightarrow{OM'} = x'$ is the orthogonal projection on $L'$ of point $M$ with radius vector $\overrightarrow{OM} = x$ (Fig. 36).

2. We now demonstrate that the orthogonal projection $M'$ of point $M$ on $L'$ is the point in $L'$ closest to $M$.

Let $y = \overrightarrow{ON}$ be an arbitrary vector of the subspace $L'$. We have to prove that

$$\|x - y\| \ge \|\bar{x}\| \tag{2}$$

where $\bar{x} = x - x'$, equality in (2) occurring if and only if $y = x'$ (that is to say, when $N$ is coincident with $M'$, Fig. 36).

Fig. 36

Set $x' - y = y_1$. Then $x - y = \bar{x} + y_1$ and

$$\|x - y\|^2 = (\bar{x} + y_1, \bar{x} + y_1) = \|\bar{x}\|^2 + \|y_1\|^2 + 2(\bar{x}, y_1) = \|\bar{x}\|^2 + \|y_1\|^2 \tag{3}$$

since $(\bar{x}, y_1) = 0$ because of the orthogonality of the vector $\bar{x}$ to the subspace $L'$ that contains $y_1$.

Observe that

$$\|y_1\|^2 = (y_1, y_1) \ge 0$$

since the metric form of the space at hand is positive definite. Therefore (2) follows from (3). Equality is attained in (2) if and only if $y_1 = 0$ (that is, when $y = x'$).



3. Let

$$L' = L(z_1, \dots, z_k)$$

where $z_1, \dots, z_k$ is a finite independent set of vectors in $L$. In this case, to find the orthogonal projection $x'$ of the given vector $x$ on the subspace $L'$ it suffices to make a suitable computation of the coefficients $\alpha_1, \dots, \alpha_k$ in the expansion

$$x' = \alpha_1 z_1 + \dots + \alpha_k z_k \tag{4}$$

To do this we write down the orthogonality condition of the vector $\bar{x} = x - x'$ to each of the vectors $z_j$:

$$(x - x', z_j) = 0 \tag{5}$$

Substituting the expansion (4) into (5) and taking advantage of the properties of a scalar product, we obtain for $\alpha_i$ a system of linear equations:

$$\sum_{i=1}^{k} \alpha_i (z_i, z_j) = (x, z_j), \quad j = 1, \dots, k \tag{6}$$

The determinant of system (6) is the Gram determinant for the positive definite quadratic form $(x, x)$ and the independent vectors $z_1, \dots, z_k$. It is therefore positive (see Section 10, Chapter IV)
and system (6) is uniquely solvable. The desired projection will 
thus be found. 
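
The computation described in this subsection can be sketched as follows. The example assumes the ordinary scalar product of coordinate vectors (the text allows any positive definite form), and the helper names `dot`, `solve` and `project` are hypothetical. System (6) is solved exactly in rational arithmetic by Gauss-Jordan elimination.

```python
from fractions import Fraction

def dot(x, y):
    """Ordinary scalar product of coordinate vectors (an assumption of this sketch)."""
    return sum(a * b for a, b in zip(x, y))

def solve(M, rhs):
    """Solve M * alpha = rhs by Gauss-Jordan elimination over the rationals."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)   # nonzero pivot row
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def project(x, zs):
    """Orthogonal projection of x on the linear hull of the independent vectors zs."""
    gram = [[dot(zi, zj) for zi in zs] for zj in zs]   # row j holds (z_i, z_j)
    rhs = [dot(x, zj) for zj in zs]
    alphas = solve(gram, rhs)                          # coefficients of (4)
    return [sum(a * z[c] for a, z in zip(alphas, zs)) for c in range(len(x))]

zs = [[1, 1, 0], [0, 1, 1]]
x = [1, 2, 3]
x_proj = project(x, zs)
x_bar = [a - b for a, b in zip(x, x_proj)]   # component orthogonal to L'
print(x_proj, x_bar)
```

The residual `x_bar` is orthogonal to both spanning vectors, which is exactly condition (5).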

4. We will need the following lemma later on.

Lemma. Given, in a space with a positive definite metric form, a system of pairwise orthogonal vectors $a_1, \dots, a_k$, that is, $(a_i, a_j) = 0$ for $i \ne j$. If none of these vectors is a zero vector, then they are linearly independent.

Proof. Consider the relation

$$\lambda_1 a_1 + \dots + \lambda_k a_k = 0 \tag{7}$$

From the scalar product of (7) by $a_1$:

$$\lambda_1 (a_1, a_1) + \lambda_2 (a_1, a_2) + \dots + \lambda_k (a_1, a_k) = (a_1, 0) \tag{8}$$

Since $a_1 \ne 0$ and the metric form is positive definite, it follows that $(a_1, a_1) = \|a_1\|^2 \ne 0$. The remaining scalar products in the left member of (8) vanish under the hypothesis of the lemma; $(a_1, 0) = 0$ because of the participation of the zero vector. Hence $\lambda_1 = 0$. Similarly we demonstrate that $\lambda_2 = \dots = \lambda_k = 0$. The proof of the lemma is complete.

5. Given in space $L$ an ordered set of linearly independent vectors $e_1, \dots, e_k$. We will now discuss replacing this set by another set of vectors that is orthogonal and, in a certain sense, equivalent to the given set. For this we carry out a geometrical construction called the process of orthogonalization. It resembles the process of choosing a basis when reducing a quadratic form to canonical form by the Jacobi method.

A new system of vectors $e_{1'}, \dots, e_{k'}$ is constructed with the following conditions observed:

(1) $e_{1'} \in L(e_1)$, $e_{2'} \in L(e_1, e_2)$, ..., $e_{i'} \in L(e_1, \dots, e_i)$, ..., $e_{k'} \in L(e_1, \dots, e_k)$;

(2) the vectors $e_{1'}, \dots, e_{k'}$ are pairwise orthogonal;

(3) the set $e_{1'}, \dots, e_{k'}$ is linearly independent.

In that case we say that the new set of vectors is obtained from the original set $e_1, \dots, e_k$ by the process of orthogonalization.

If the given system consists of three vectors $e_1$, $e_2$, $e_3$ in three-dimensional Euclidean space, then we construct the new system $e_{1'}$, $e_{2'}$, $e_{3'}$ as follows:

we retain the first vector ($e_{1'} = e_1$);

the second vector is drawn orthogonal to it in the plane passing through $e_1$ and $e_2$;

the third vector is drawn orthogonal to this plane (Fig. 37).

When passing to greater dimensionalities, the fourth vector is placed perpendicular to the given three-dimensional space, and so on. In the general case we set

$$\begin{aligned}
e_{1'} &= e_1, \\
e_{2'} &= e_2 + \alpha e_{1'}, \\
e_{3'} &= e_3 + \beta e_{2'} + \gamma e_{1'}, \\
&\;\;\vdots \\
e_{k'} &= e_k + \lambda_1 e_{(k-1)'} + \lambda_2 e_{(k-2)'} + \dots + \lambda_{k-1} e_{1'}
\end{aligned} \tag{9}$$

From formulas (9) it follows that the vectors $e_{i'}$ are located in the required linear hulls and are nonzero because of the independence of the vectors $e_1, \dots, e_k$.

It remains now to select the coefficients $\alpha, \beta, \dots$ so that the vectors $e_{i'}$ are pairwise orthogonal. Then the system $e_{1'}, \dots, e_{k'}$ will be independent by the lemma of Subsection 4.

We find $\alpha$. We have

$$(e_{2'}, e_{1'}) = (e_2, e_{1'}) + \alpha (e_{1'}, e_{1'}) = 0$$

whence

$$\alpha = -\frac{(e_2, e_{1'})}{(e_{1'}, e_{1'})} \tag{10}$$

Division is possible since $(e_{1'}, e_{1'}) = (e_1, e_1) \ne 0$. The vector $(-\alpha e_{1'})$ is the orthogonal projection of $e_2$ on $L(e_1)$ (Fig. 38).

We now ensure that the third vector is orthogonal to the first two:

$$(e_{3'}, e_{1'}) = (e_3, e_{1'}) + \underline{\beta (e_{2'}, e_{1'})} + \gamma (e_{1'}, e_{1'}) = 0,$$

$$(e_{3'}, e_{2'}) = (e_3, e_{2'}) + \beta (e_{2'}, e_{2'}) + \underline{\gamma (e_{1'}, e_{2'})} = 0$$

The underlined terms vanish and $(e_{2'}, e_{2'}) \ne 0$ by construction. Therefore we find

$$\gamma = -\frac{(e_3, e_{1'})}{(e_{1'}, e_{1'})}, \quad \beta = -\frac{(e_3, e_{2'})}{(e_{2'}, e_{2'})} \tag{11}$$

Geometrically, formulas (9) and (11) mean that to construct vector $e_{3'}$ it is necessary to subtract from vector $e_3$ its orthogonal projection on the subspace $L(e_1, e_2)$ (Fig. 39).




From there on the process continues in similar fashion. 
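
The general step of the process can be sketched in a few lines. The example below assumes the ordinary scalar product of coordinate vectors and exact rational arithmetic; the function name `orthogonalize` and the sample vectors are illustrative. Each coefficient is computed as in (10) and (11): minus the scalar product of the original vector with an already constructed vector, divided by the scalar square of the latter.

```python
from fractions import Fraction

def dot(x, y):
    """Ordinary scalar product of coordinate vectors (an assumption of this sketch)."""
    return sum(a * b for a, b in zip(x, y))

def orthogonalize(es):
    """Apply formulas (9) to an ordered independent set of vectors."""
    result = []
    for e in es:
        v = [Fraction(c) for c in e]
        for u in result:
            coeff = -dot(e, u) / dot(u, u)        # as in (10) and (11)
            v = [vc + coeff * uc for vc, uc in zip(v, u)]
        result.append(v)
    return result

es = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
new = orthogonalize(es)
for v in new:
    print(v)
```

No normalization is performed here, so the resulting vectors are orthogonal but in general not of unit norm.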

6. In the orthogonalization process it is often necessary to ensure two more supplementary conditions:

(4) for any $i \le k$ the system $e_{1'}, \dots, e_{i'}$ is oriented like the system $e_1, \dots, e_i$;

(5) $\|e_{i'}\| = 1$.

Formulas (9) guarantee condition (4). Indeed, from (9) we have

$$\begin{aligned}
e_1 &= e_{1'}, \\
e_2 &= -\alpha e_{1'} + e_{2'}, \\
&\;\;\vdots \\
e_k &= -\lambda_{k-1} e_{1'} - \dots - \lambda_1 e_{(k-1)'} + e_{k'},
\end{aligned}$$

so that in the matrix expressing $e_i$ in terms of $e_{i'}$, the upper left minor of order $i$ (for any $i \le k$) is positive (and equal to $+1$).

To ensure condition (5), it suffices, after carrying through the orthogonalization, to divide each of the resulting vectors by its norm.

Remark. It is easy to prove (by induction, for instance) that conditions (1) to (5) given in Subsections 5 and 6 uniquely determine a system of vectors $e_{1'}, \dots, e_{k'}$ from the given system $e_1, \dots, e_k$.

7. Legendre polynomials. In mathematical analysis and its applications, one makes use of expansions of arbitrary functions in series of given functions, such expansions being viewed like the expansions of vectors in terms of a given basis. It is then convenient to have analogues of an orthogonal basis. These are orthogonal systems of functions. An elementary instance of an orthogonal system is the set of Legendre polynomials.

Introduced in the space of continuous functions on the interval $[-1, 1]$ is a quadratic metric with scalar product

$$(x, y) = \int_{-1}^{+1} x(t) y(t)\, dt \tag{12}$$

Accordingly,

$$\|x\|^2 = \int_{-1}^{+1} x^2(t)\, dt \tag{13}$$

We have already considered (13) and have demonstrated that this is a quadratic form (see Section 4, Chapter IV). Notice that it is positive definite: $\|x\|^2 \ge 0$, with $\|x\|^2 = 0$ if and only if the continuous function $x(t) = 0$ at all points of the interval.

Take a sequence of monomials

$$1, \; t, \; t^2, \; \dots, \; t^k, \; \dots \tag{14}$$

and apply to it the process of orthogonalization to get the sequence of polynomials

$$f_0(t) = 1, \quad f_1(t) = t, \quad f_2(t) = t^2 - \tfrac{1}{3}, \quad \dots \tag{15}$$

The number-labels of the polynomials in (15) are chosen so that they coincide with their degrees. The coefficients of the polynomials are computed from formulas (9) with account taken of (10), (11), (12) and (14).

Following a special normalization

$$p_k(t) = h_k f_k(t)$$

where the $h_k$ are chosen from the condition

$$p_k(1) = 1 \tag{16}$$

we get a sequence of polynomials $p_k(t)$ (of degree $k = 0, 1, 2, \dots$) which are called Legendre polynomials. It can be demonstrated that

$$p_k(t) = \frac{1}{2^k k!} \frac{d^k}{dt^k} (t^2 - 1)^k \tag{17}$$

Taking into account the remark of Subsection 6, it suffices to verify that all polynomials (17) are pairwise orthogonal (it is convenient here to make use of integration by parts) and that they satisfy the condition (16).

It can also be proved that

$$\|p_k\|^2 = \frac{2}{2k + 1}$$

Thus, the set of Legendre polynomials is orthogonal but not normalized (the norms of $p_k$ are not equal to unity).
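
The construction of this subsection can be reproduced exactly for the first few polynomials. In the sketch below a polynomial is stored as a list of coefficients in ascending powers of $t$, the scalar product (12) is evaluated term by term from the integrals of monomials over $[-1, 1]$, and the normalization (16) is applied at the end. The representation and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

def inner(p, q):
    """Scalar product (12) of two polynomials given by ascending coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:          # odd powers integrate to zero on [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def legendre(count):
    """First `count` Legendre polynomials by orthogonalization of 1, t, t^2, ..."""
    fs = []
    for k in range(count):
        mono = [Fraction(0)] * k + [Fraction(1)]      # the monomial t^k
        f = list(mono)
        for u in fs:
            coeff = -inner(mono, u) / inner(u, u)
            padded = u + [Fraction(0)] * (k + 1 - len(u))
            f = [fc + coeff * uc for fc, uc in zip(f, padded)]
        fs.append(f)
    # Normalization (16): the value at t = 1 is the sum of the coefficients.
    return [[c / sum(f) for c in f] for f in fs]

ps = legendre(4)
for k, p in enumerate(ps):
    print(k, p, inner(p, p))
```

For these first four polynomials the squared norms come out as $2/(2k + 1)$, so the system is orthogonal without being normalized.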

§ 5. Metric isomorphism 

1. Definition. Two quadratic-metric spaces $L$ and $L'$ are said to be metrically isomorphic to one another if there exists a linear isomorphism between them under which the scalar product of any pair of vectors in $L$ is equal to the scalar product of their images in $L'$. Under this condition, the linear isomorphism is called a metric isomorphism (the linear isomorphism is discussed in Section 10, Chapter I).

Remark. Metrically isomorphic spaces have the same properties, not only linear but also metric, that is to say, those based on the concept of a scalar product. It therefore suffices to study one of a set of metrically isomorphic spaces in order to have a knowledge of all of them.

2. Theorem 1. Quadratic-metric spaces with the same dimensions and the same positive indices are metrically isomorphic.

Proof. Let $L$ and $L'$ both be $n$-dimensional and let them have one and the same positive index $k$ ($0 \le k \le n$). By Section 4 we can find in $L$ an orthonormal basis $e_1, \dots, e_n$ and in $L'$ an orthonormal basis $e_{1'}, \dots, e_{n'}$. These bases have the same number of unit vectors (equal to $k$). We assume that in each of them the first $k$ vectors are unit vectors.

Let $x$ be an arbitrary vector of $L$. Expand it in terms of the basis $e_1, \dots, e_n$: $x = x^1 e_1 + \dots + x^n e_n$. To the vector $x$ let there be associated a vector $x' \in L'$ which has the same components relative to the basis $e_{1'}, \dots, e_{n'}$: $x' = x^1 e_{1'} + \dots + x^n e_{n'}$. We have thus established a linear isomorphism between $L$ and $L'$ (see Section 10, Chapter I). Now consider two arbitrary vectors $x$, $y$ of $L$ and their images $x'$, $y'$ in $L'$. Since, relative to the bases $e_1, \dots, e_n$ and $e_{1'}, \dots, e_{n'}$, the metric forms of the spaces $L$ and $L'$ have the same component representations and the components of the vectors $x$, $y$ coincide respectively with the components of the vectors $x'$, $y'$, it follows that $(x, y) = (x', y')$. Thus the linear isomorphism established between $L$ and $L'$ is a metric isomorphism. The theorem is proved.

For the quadratic-metric spaces $L$ and $L'$ the following theorem holds true.

Theorem 2. If $L$ has dimension $n$ and positive index $k$ ($0 \le k \le n$) and $L'$ is metrically isomorphic to $L$, then $L'$ also has dimension $n$ and positive index $k$.

Proof. Let $e_1, \dots, e_n$ be an orthonormal basis in $L$ with the first $k$ vectors being unit vectors. Let the vectors $e_{1'}, \dots, e_{n'} \in L'$ correspond to the vectors $e_1, \dots, e_n$ by an isomorphism. Since a metric isomorphism is a linear isomorphism, by repeating the proof of Theorem 2, Section 10, Chapter I, we find that the dimension of $L'$ is equal to $n$ and that $e_{1'}, \dots, e_{n'}$ constitute a basis in $L'$. It follows immediately from the definition of a metric isomorphism that the basis $e_{1'}, \dots, e_{n'} \in L'$ is orthonormal and that the first $k$ vectors in it (and only those vectors) are unit vectors. The proof of the theorem is complete.

Corollary. Quasi-Euclidean spaces with different dimensions or with different positive indices are not isomorphic. A Euclidean space is not isomorphic to any quasi-Euclidean space.

Remark. However, there is no need to make a separate study of $n$-dimensional quadratic-metric spaces with positive indices $k$ and $n - k$. It is sufficient to change the sign of the metric form in one of them in order to obtain another.

§ 6. k-orthogonal matrices and k-orthogonal groups

1. We consider an $n$-dimensional quadratic-metric space with a given positive index $k$ ($0 \le k \le n$). In this space take any two bases $e_1, \dots, e_n$ and $e_{1'}, \dots, e_{n'}$, provided that they are both orthonormal and the first $k$ vectors in each are unit vectors.

By the usual procedure write down the change-of-basis formulas from first to second:

$$e_{i'} = \sum_i P_{i'}^{i} e_i \tag{1}$$

The matrix $P$ made up of the coefficients of these formulas is of a special nature in this case. To figure it out, write down the metric form of the space relative to the basis $e_1, \dots, e_n$:

$$\|x\|^2 = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

Along with this form consider the matrix

$$E_k = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & & -1 & & \\ & & & & \ddots & \\ & & & & & -1 \end{pmatrix} \tag{i}$$

in which we have $+1$ in the first $k$ places on the main diagonal and $-1$ in the remaining places of the main diagonal with zeros elsewhere. Clearly, the matrix $G$ of the metric form relative to the basis $e_1, \dots, e_n$ coincides with matrix $E_k$:

$$G = E_k$$

By virtue of our conditions concerning the bases at hand, the metric form in the basis $e_{1'}, \dots, e_{n'}$ has exactly the same form (i) as in the basis $e_1, \dots, e_n$. And so matrix $G'$ of the metric form in the basis $e_{1'}, \dots, e_{n'}$ is also equal to $E_k$:

$$G' = E_k$$

On the other hand, by the general law for transformation of the matrix of a quadratic form we have

$$G' = P G P^*$$

We thus conclude that if $P$ is a change-of-basis matrix from one orthonormal basis to another, then

$$P E_k P^* = E_k \tag{2}$$

It is essential to note that in both bases it is precisely the first $k$ vectors that are unit vectors, all other vectors being imaginary-unit ones.

It is easy to see that the converse has been proved at the same time: if the matrix $P$ satisfies condition (2), if the original basis $e_1, \dots, e_n$ is orthonormal, and if the first $k$ vectors in it are unit vectors (the remaining being imaginary-unit vectors), then the basis $e_{1'}, \dots, e_{n'}$ obtained by (1) will also be orthonormal and its first $k$ vectors will also be unit vectors.

2. Definition. Any $n \times n$ matrix $P$ satisfying condition (2) is termed a $k$-orthogonal ($0 \le k \le n$) matrix.

Note that this definition is of a purely algebraic nature. It could be given quite apart from the geometry of quadratic-metric spaces.

3. All $k$-orthogonal matrices are nonsingular. Indeed, it is obvious that $\det E_k = \pm 1$.

From this and from (2)

$$\det P \cdot \det P^* = 1 \tag{3}$$

Hence $\det P \ne 0$. And from (3) it is also clear that $\det P = \pm 1$.

4. Because of (2) we have $P^{-1} E_k = E_k P^*$ or (since $E_k E_k = E$)

$$P^{-1} = E_k P^* E_k \tag{4}$$

We see that the matrix-inversion operation, which in the general case is a cumbersome one, reduces, for $k$-orthogonal matrices, to the operation of taking the transpose and multiplying by $E_k$ (the
latter merely denotes a change in the sign of certain elements). 
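
Conditions (2), (3) and (4) can be checked on a concrete matrix. The sketch below takes $n = 2$, $k = 1$ and uses a matrix built from hyperbolic functions as the sample $1$-orthogonal matrix; this choice, the parameter value and the helper names are assumptions made for the illustration. Comparisons are made up to a small floating-point tolerance.

```python
import math

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

E_k = [[1, 0], [0, -1]]                     # n = 2, k = 1
t = 0.7                                     # arbitrary illustrative parameter
P = [[math.cosh(t), math.sinh(t)],
     [math.sinh(t), math.cosh(t)]]

lhs = mat_mul(mat_mul(P, E_k), transpose(P))        # left member of (2)
P_inv = mat_mul(mat_mul(E_k, transpose(P)), E_k)    # inverse by formula (4)
product = mat_mul(P, P_inv)
det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]

print(close(lhs, E_k), close(product, [[1, 0], [0, 1]]), det_P)
```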

5. Theorem. $k$-orthogonal matrices constitute a subgroup of the group of all nonsingular $n \times n$ matrices.

We denote the subgroup by $O_k$ and will speak of the $k$-orthogonal subgroup (or group).

Proof. For the time being let $O_k$ merely denote the set of all $k$-orthogonal matrices. From $O_k$ take any two matrices $P_1$, $P_2$; then $P_1 E_k P_1^* = E_k$, $P_2 E_k P_2^* = E_k$, whence

$$(P_1 P_2) E_k (P_1 P_2)^* = P_1 (P_2 E_k P_2^*) P_1^* = P_1 E_k P_1^* = E_k$$

Thus if $P_1 \in O_k$ and $P_2 \in O_k$, then $P_1 P_2 \in O_k$.

From $O_k$ take an arbitrary matrix $P$; then $P E_k P^* = E_k$. From this and because of (4)

$$P^{-1} E_k (P^{-1})^* = P^{-1} E_k (E_k P^* E_k)^* = P^{-1} E_k (E_k P E_k) = P^{-1} P E_k = E_k$$

Thus if $P \in O_k$, then $P^{-1} \in O_k$ and the proof is complete.

Remark. By Subsection 3, all $O_k$ ($0 \le k \le n$) lie in the subgroup of $n \times n$ matrices with unit modulus of the determinant.

6. The set of all orthonormal bases in quadratic-metric space with a given positive index is nothing but a class of bases defined with respect to the group $O_k$ by some one orthonormal basis of the space (see Section 1, Chapter VI).

The geometry of quadratic-metric space has as its object the study of invariants with respect to the group $O_k$ in the class of orthonormal bases. We have in mind here invariants in the broad sense of the word; namely, not only invariant numerical quantities (like, say, scalar products, the norm of a vector), but also invariant objects (say, planes) and invariant relations (for instance, the orthogonality relation).

7. Observe that any class of bases relative to the group $O_k$ is a class of orthonormal bases in some (quite definite) quadratic metric.

Indeed, let $e_1, \dots, e_n$ be an arbitrary basis of a linear space $L$. By Subsection 2, Section 3, there exists a (quite definite) quadratic metric in which the basis $e_1, \dots, e_n$ is orthonormal and has the first $k$ vectors for unit vectors. Then the class of bases defined with respect to the group $O_k$ by the basis $e_1, \dots, e_n$ will consist of orthonormal bases in precisely this metric.

8. Conclusion. Thus the set of all bases of an $n$-dimensional linear space splits up into classes with respect to the group $O_k$ so that to each class there corresponds a specific quadratic metric in which the bases of this class are orthonormal.

At the same time there is defined an infinite set of quadratic-metric spaces on one and the same, to put it pictorially, linear “skeleton” $L$. They are all metrically isomorphic to one another. The geometries of these spaces are algebraically identical since they all have as their object of study the invariants of the group $O_k$. However, from the viewpoint of the linear space $L$, these quadratic-metric spaces are distinct for the reason that one and the same pair of vectors $x$, $y$ in $L$ have in them distinct numerical values of the scalar product. This will all shortly be explained in examples (see Sections 7 and 8).

9. Assume that the quadratic metric has been chosen. Let the $k$-orthogonal matrix $P$ define a transition from the orthonormal basis $e_1, \dots, e_n$ to the orthonormal basis $e_{1'}, \dots, e_{n'}$. Also consider the corresponding transformation of components (coordinates):

$$x^{i'} = \sum_i Q_i^{i'} x^i \tag{5}$$

The matrix of this transformation is $Q = (P^*)^{-1}$. To see that the matrix $Q$ is also $k$-orthogonal, consider the following chain of equations:

$$Q E_k Q^* = (P^*)^{-1} E_k P^{-1} = (P E_k P^*)^{-1} = E_k^{-1} = E_k$$

We obtain the relation $Q E_k Q^* = E_k$, which establishes the $k$-orthogonality of $Q$.

Thus, in changing from one orthonormal basis to another one, the components of an arbitrary vector are subjected (as variables) to a linear transformation with a $k$-orthogonal matrix.

Remark. The linear transformations (5) of the variables $x^1, \dots, x^n$ to the variables $x^{1'}, \dots, x^{n'}$ with the $k$-orthogonal matrix $Q$ may be characterized without resorting to a transformation of bases and, accordingly, without resorting to matrix $P$. It is precisely such linear transformations, and only such transformations, that preserve the normal form of a quadratic form. In other words, if matrix $Q$ is $k$-orthogonal (and only in this case), then we have the identity

$$(x^{1'})^2 + \dots + (x^{k'})^2 - (x^{(k+1)'})^2 - \dots - (x^{n'})^2 = (x^1)^2 + \dots + (x^k)^2 - (x^{k+1})^2 - \dots - (x^n)^2$$

in the left-hand member of which $x^{1'}, \dots, x^{n'}$ are expressed by formulas (5).

10. From the foregoing it is clear that the set of all linear transformations of variables with $k$-orthogonal matrices constitutes a group that is isomorphic to the group $O_k$; the isomorphism here can be a mapping which to the matrix $P \in O_k$ associates a linear transformation with matrix $Q = (P^*)^{-1}$. This is evident from formal operations with matrices. If $P_1, P_2 \in O_k$ and $Q_1 = (P_1^*)^{-1}$, $Q_2 = (P_2^*)^{-1}$, then $Q_1 Q_2 = (P_1^*)^{-1} (P_2^*)^{-1} = (P_2^* P_1^*)^{-1} = ((P_1 P_2)^*)^{-1}$. We see that to a product of matrices $P_1$, $P_2$ corresponds a product of their images, which serves as the condition of an isomorphism.
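
The invariance of the normal form under the transformation of components with matrix $Q = (P^*)^{-1}$ can be verified exactly on a small example. The sketch below takes $n = 2$, $k = 1$ and a sample $1$-orthogonal matrix with rational entries; the matrix, the test vector and the function names are illustrative assumptions.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Apply the matrix M (list of rows) to the column of components v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def normal_form(x, k):
    """(x^1)^2 + ... + (x^k)^2 - (x^(k+1))^2 - ... - (x^n)^2."""
    return sum(c * c for c in x[:k]) - sum(c * c for c in x[k:])

k = 1
c, s = Fraction(5, 3), Fraction(4, 3)    # chosen so that c*c - s*s == 1
P = [[c, s], [s, c]]                     # a sample 1-orthogonal matrix
Q = [[c, -s], [-s, c]]                   # (P*)^(-1); here it equals E_k P E_k

x = [Fraction(3), Fraction(1)]
x_new = mat_vec(Q, x)
print(normal_form(x, k), normal_form(x_new, k))
```

Both values coincide, as the identity in the Remark of Subsection 9 requires.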

§ 7. The group of Euclidean rotations 

1. In the two-dimensional case there are two metrically nonisomorphic
spaces with positive index, $k = 1$ and $k = 2$ respectively.

If $k = 2$, then the metric form relative to an orthonormal basis
is

$$\|x\|^2 = (x^1)^2 + (x^2)^2 \qquad (1)$$

To it there corresponds the geometry of the ordinary Euclidean
plane (where the scalar product is given by the familiar formula
$(x, y) = x^1 y^1 + x^2 y^2$, where the angle between vectors is defined,
where the trigonometric functions of angles are given in terms of
coordinates by the familiar formulas of elementary analytic geometry,
and so on). If $k = 1$, then

$$\|x\|^2 = (x^1)^2 - (x^2)^2 \qquad (2)$$

The metric form (2) is associated with a two-dimensional Minkowski
geometry.

2. Let us investigate form (1). We begin with a consideration
of $k$-orthogonal matrices. Incidentally, note from the very start
that for $n = 2$, $k = 2$ (and in general for $k = n$) $k$-orthogonal
matrices are simply called orthogonal matrices.

When $n = 2$ and $k = 2$, we have $E_k = E$. For this reason, in
the given special case, the general condition $P E_k P^* = E_k$ for
$k$-orthogonality of matrix $P$ assumes the form: $P P^* = E$.

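Let

$$P = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

From what has been said, this matrix is orthogonal if and only if

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

whence

$$\alpha^2 + \beta^2 = 1, \quad \alpha\gamma + \beta\delta = 0, \quad \gamma\alpha + \delta\beta = 0, \quad \gamma^2 + \delta^2 = 1 \qquad (3)$$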


Here we have three different equations. The system is simple and
there is no difficulty in finding all solutions. From the second
equation of (3) we can write $\gamma = -\lambda\beta$, $\delta = +\lambda\alpha$, where $\lambda$ is a
new unknown. Substituting these expressions into the last equation
of the system, we get

$$\gamma^2 + \delta^2 = \lambda^2 (\alpha^2 + \beta^2) = \lambda^2 = 1$$


Thus, $\lambda = \pm 1$. To determine the geometric meaning of the choice of
sign, compute the determinant of matrix $P$:

$$\det P = \begin{vmatrix} \alpha & \beta \\ -\lambda\beta & \lambda\alpha \end{vmatrix} = \lambda (\alpha^2 + \beta^2) = \lambda$$

Consequently, to the values $\lambda = \pm 1$ correspond transformations
of the basis with orientation preserved or disrupted, respectively.

Now take advantage of the equation $\alpha^2 + \beta^2 = 1$. Due to this
equation, $\alpha = \cos\theta$, $\beta = \sin\theta$, where $\theta$ is an arbitrary parameter.
At the same time we have $\gamma = -\lambda\sin\theta$, $\delta = \lambda\cos\theta$. We have thus
found all solutions of system (3) and, respectively, all orthogonal
matrices (only for $n = 2$ of course). Let us confine ourselves to
$\lambda = 1$. Then

$$P = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \qquad (4)$$

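Formula (4) yields all orthogonal matrices for which $\det P > 0$
(that is, $\det P = +1$). It is easy to see that by themselves they
constitute a group (a subgroup of the orthogonal group). This
is also evident from the following two equations:

$$\begin{pmatrix} \cos\theta_1 & \sin\theta_1 \\ -\sin\theta_1 & \cos\theta_1 \end{pmatrix} \begin{pmatrix} \cos\theta_2 & \sin\theta_2 \\ -\sin\theta_2 & \cos\theta_2 \end{pmatrix} = \begin{pmatrix} \cos(\theta_1 + \theta_2) & \sin(\theta_1 + \theta_2) \\ -\sin(\theta_1 + \theta_2) & \cos(\theta_1 + \theta_2) \end{pmatrix},$$

$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}^{-1} = \begin{pmatrix} \cos(-\theta) & \sin(-\theta) \\ -\sin(-\theta) & \cos(-\theta) \end{pmatrix}$$

They can readily be verified and they express the fact that the
product of matrices like (4) and the inversion of a matrix like (4)
yield matrices of the same type.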
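
The solution of system (3) and the two group relations can be tested numerically. The sketch below is an illustration only; the function name `orthogonal` and the chosen angles are assumptions and not part of the text.

```python
import math

def orthogonal(theta, lam=1):
    """Solution of system (3): alpha = cos, beta = sin,
    gamma = -lam*sin, delta = lam*cos; lam = 1 gives a matrix of type (4)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-lam * s, lam * c]]

def matmul(a, b):
    return [[sum(a[i][s] * b[s][j] for s in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

E = [[1.0, 0.0], [0.0, 1.0]]
t1, t2 = 0.7, -2.1

# Both signs of lam give orthogonal matrices, and det P equals lam.
checks_orthogonal = all(
    close(matmul(orthogonal(t1, lam), transpose(orthogonal(t1, lam))), E)
    and abs(det(orthogonal(t1, lam)) - lam) < 1e-12
    for lam in (1, -1)
)
# Matrices of type (4) are closed under product and inversion.
product_ok = close(matmul(orthogonal(t1), orthogonal(t2)), orthogonal(t1 + t2))
inverse_ok = close(matmul(orthogonal(t1), orthogonal(-t1)), E)
```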





3. Let us take an orthonormal basis $e_1, e_2$ on the Euclidean
plane. Referring to Fig. 40 we see that the orthogonal vectors $e_1, e_2$
issue from the origin and terminate on the unit circle
$(x^1)^2 + (x^2)^2 = 1$.

Let us pass to a new basis via matrix $P$ of type (4):

$$e_{1'} = e_1 \cos\theta + e_2 \sin\theta, \qquad e_{2'} = -e_1 \sin\theta + e_2 \cos\theta \qquad (5)$$

By virtue of the first of these equations, the vector $e_{1'}$ is a unit
vector and forms with vector $e_1$ an angle $\theta$ with the ordinary
condition for orientation of angles (i.e., taking into account sign, as
is usual in trigonometry). The second equation can be written as

$$e_{2'} = e_1 \cos\left(\theta + \frac{\pi}{2}\right) + e_2 \sin\left(\theta + \frac{\pi}{2}\right)$$

From this it follows that the vector $e_{2'}$, being a unit vector, forms
with the vector $e_1$ an angle of $\theta + \frac{\pi}{2}$. Hence with the vector $e_2$ it
forms the same angle $\theta$ as the vector $e_{1'}$ does with the vector $e_1$.
In other words, the basis $e_{1'}, e_{2'}$ results from a rotation of the
basis $e_1, e_2$ through the angle $\theta$.

Thus, the orthonormal basis $e_1, e_2$, taken arbitrarily in a Euclidean
plane, defines, with respect to the group of matrices (4), a
class of bases that result from a rotation of the basis $e_1, e_2$ through
all possible angles. They are all orthonormal and identically oriented
with the basis $e_1, e_2$.

Remark. In order to obtain a class of bases with respect to the
entire orthogonal group by proceeding from the basis $e_1, e_2$, it is
necessary to make the additional construction of a class of bases
with respect to the group of matrices (4) by taking $e_1, -e_2$ for the
original basis.

4. To a transformation of the basis via matrix $P$ there corresponds
a transformation of coordinates with matrix $Q = (P^*)^{-1}$. In
this case $P P^* = E$, whence $Q = P$. Hence if the basis transforms
via formulas (5), then the components of an arbitrary vector
transform via the formulas with the same matrix:

$$x^{1'} = x^1 \cos\theta + x^2 \sin\theta, \qquad x^{2'} = -x^1 \sin\theta + x^2 \cos\theta \qquad (6)$$

5. We have just considered relations (6) as formulas of transformation
of the components of the given vector $x = x^1 e_1 + x^2 e_2$
under a rotation of the basis $e_1, e_2$ (Fig. 41a). These same formulas
may be viewed from a different standpoint. Namely, we can assume
that the basis $e_1, e_2$ does not change and that formulas (6) associate
with the arbitrary vector $x = x^1 e_1 + x^2 e_2$ a new vector
$x' = x^{1'} e_1 + x^{2'} e_2$. In this sense, the formulas (6) constitute a
component (coordinate) representation, relative to the basis $e_1, e_2$,
of a linear transformation of the Euclidean plane. Let us denote it
by $I_\theta$. By formula (6), the vector $x' = I_\theta x$ has the same norm as
the vector $x$ and is obtained via a rotation of $x$ through the angle
$(-\theta)$ (Fig. 41b). Since the angle $\theta$ is the same for all vectors,
they all rotate in the same fashion under the linear transformation
$x' = I_\theta x$. For this reason, the linear transformation $I_\theta$ is called a
rotation of the Euclidean plane through the angle $(-\theta)$.

Fig. 41

The set of all rotations (that is, through all possible angles)
constitutes the group of rotations of the Euclidean plane. It is
isomorphic to the group of matrices of type (4), which for this reason
is also called a rotation group.
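
The "active" reading of formulas (6) can be illustrated on a single vector. The following sketch assumes a particular vector and angle (both chosen arbitrarily) and uses hypothetical helper names; it checks that the norm is preserved and that the polar angle changes by $-\theta$.

```python
import math

def apply_formulas_6(x, theta):
    """Components (x1', x2') obtained from (x1, x2) by formulas (6)."""
    x1, x2 = x
    return (x1 * math.cos(theta) + x2 * math.sin(theta),
            -x1 * math.sin(theta) + x2 * math.cos(theta))

def polar_angle(x):
    """Oriented angle between the basis vector e1 and the vector x."""
    return math.atan2(x[1], x[0])

x = (2.0, 1.0)      # an assumed sample vector
theta = 0.4         # an assumed sample angle
x_new = apply_formulas_6(x, theta)

norm_before = math.hypot(*x)
norm_after = math.hypot(*x_new)
turn = polar_angle(x_new) - polar_angle(x)   # expected to equal -theta
```

The sample values are chosen so that the polar angle does not cross the branch cut of `atan2`; for other inputs the difference of angles should be reduced modulo $2\pi$.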

6. A transformation that preserves the metric of the space is 
said to be isometric. 

We now confine ourselves to some examples and leave a closer 
study of isometric transformations to the next chapter. 

7. Under any rotation $x' = I_\theta x$ the metric properties of the
images coincide with the metric properties of the inverse images
(the norm of an image is equal to the norm of the inverse image:
$\|x'\| = \|x\|$; a scalar product of images is equal to the scalar
product of the inverse images: $(x', y') = (x, y)$). Therefore every
rotation $I_\theta$ is an isometric transformation. Also included in
isometric transformations of the Euclidean plane are reflections about
a straight line; for example, the transformation $x^{1'} = x^1$, $x^{2'} = -x^2$.

Proof is given below (see Sections 7, 8, Chapter IX) that an
arbitrary isometric transformation on a two-dimensional Euclidean
plane is determined relative to an orthonormal basis by a component
representation with an arbitrary orthogonal matrix (with determinant
of any sign). It is either a rotation of the plane through
an angle, or a reflection, or the composition of a reflection followed
by a rotation.

Remark. We disregard parallel translations of the Euclidean
plane that displace the origin of coordinates because for the present
we view the Euclidean plane as a vector space.

8. Considered in the geometry of the Euclidean plane are the
invariants of the orthogonal group. Here the fact of invariance is of
a purely algebraic nature. For example, the invariance of the
norms of vectors means the identity
$(x^{1'})^2 + (x^{2'})^2 = (x^1)^2 + (x^2)^2$ as
a consequence of formulas (6) or the formulas $x^{1'} = x^1$, $x^{2'} = -x^2$.
From the geometrical standpoint, two views are possible.
If the orthogonal group is regarded as a group generating the
class of orthonormal bases, then invariance under this group
signifies equivalence of such bases. If the orthogonal group is
viewed as a group generating isometric linear transformations,
then invariance under this group signifies preservation of the
metric properties of the figures (systems of vectors) under
rotations and reflections.

9. We have already pointed out that in one and the same linear
space it is possible to introduce a metric in different ways by taking
distinct bilinear forms for the scalar product. Let us illustrate
this fact using the Euclidean plane.

Elementary geometry states that in the Euclidean plane it is
possible to compare lengths and to measure angles. Let a scale
unit be chosen and the unit vectors $a_1, a_2$ (orthogonal from the
elementary viewpoint) be taken for a basis. Then we can introduce
the scalar product $(x, y)$ and put

$$(x, y) = x^1 y^1 + x^2 y^2 \qquad (7)$$

where $x = x^1 a_1 + x^2 a_2$, $y = y^1 a_1 + y^2 a_2$. By Sections 1, 2 we can
define the lengths of all vectors and the concept of orthogonality,
and, by a familiar formula of elementary analytic geometry, it
is possible to express the angle between two vectors in terms of
their lengths and the scalar product. Then the lengths and angles
determined via the scalar product (7), which is given relative to
the basis $a_1, a_2$, will coincide with the lengths and angles
determined in elementary plane geometry.

Now, on this same plane, we will examine a geometry that is
introduced artificially in addition to the natural geometry. To do
so, we take some nonorthonormal basis $e_1, e_2$ in addition to the
basis $a_1, a_2$. In Fig. 42 we have, for convenience, taken $e_2$ as a
unit vector and $e_1$ as a vector exceeding unity in length and
orthogonal to the vector $e_2$. Proceeding from this basis, we construct a
class of bases with respect to the group of matrices (4), that is to
say, by formulas (5). In the plane we now introduce a new quadratic
metric by determining the scalar product by the same formula (7),
but this time we will assume that $x^1, x^2, y^1, y^2$ are the
components of the vectors $x, y$ relative to the basis $e_1, e_2$ shown in
Fig. 42. In the new metric, the vectors $e_1, e_2$ are orthogonal and
have unit lengths; the basis $e_{1'}, e_{2'}$ defined by formulas (5) is also
orthonormal for any value of $\theta$. In short, in the new metric
everything up to this point is repeated. But from the standpoint of
the old metric, everything is depicted in a distorted manner. For
example, a unit circle which in the basis $e_1, e_2$ of Fig. 42 is given
by the equation $(x^1)^2 + (x^2)^2 = 1$ is an ellipse in the sense of the
old metric. An arbitrary orthonormal basis defined by formulas (5)
is composed of the vectors $e_{1'}, e_{2'}$, which in the old metric are not
orthogonal and lie on two conjugate diameters of the ellipse
$(x^1)^2 + (x^2)^2 = 1$. To see the truth of these remarks, it suffices to
establish a metric isomorphism between the Euclidean plane with
its original metric and the same plane with its new metric. By
Section 5 (see the proof of Theorem 1), we obtain a metric
isomorphism if we establish a linear isomorphism in which the bases
depicted in Figs. 40 and 42 correspond to each other. To make this
easier to picture, we assume that the Euclidean plane is in the
form of two copies, $P_1$ and $P_2$, corresponding to Fig. 40 and
Fig. 42. Situate $P_1$ and $P_2$ in three-dimensional Euclidean space as
they are shown in Fig. 43. That is, bring to coincidence the vectors
denoted by $e_2$ in planes $P_1$ and $P_2$ (this can be done since we took
them to be unit vectors); then rotate $P_2$ about $e_2$, bringing it to a
position in which the endpoints of the vectors denoted by $e_1$ lie on
a single perpendicular to the plane $P_1$. Now, with each vector $x$
of $P_1$ we associate a vector $x'$ of $P_2$ that projects orthogonally on
$P_1$ into vector $x$. This correspondence is clearly a linear
isomorphism; at the same time it is a metric isomorphism since to the
orthonormal basis of plane $P_1$ corresponds a basis of plane $P_2$
that is orthonormal in its new metric.

Fig. 44

From our construction we immediately perceive the truth of the
foregoing remarks. Namely, that the unit circle in the new metric
of plane $P_2$ is an ellipse in the old metric; that the bases which are
orthonormal in the new metric are composed of vectors that lie
along conjugate diameters of the ellipse. For the new metric
properties of vectors in $P_2$ we take the properties of their inverse
images in $P_1$ (namely, for the norm of a vector in $P_2$ we take the
norm of its inverse image in $P_1$; for the scalar product of two
vectors in $P_2$ we take the scalar product of their inverse images in
$P_1$). Note, in particular, that the linear transformation $I_\theta$, which, in
the basis $e_1, e_2$ of $P_2$, has the component representation (6), is a
rotation of the plane $P_2$ in the sense of the new metric, whereas in
the original metric this transformation is the so-called elliptic
rotation of the Euclidean plane. The name is related to the fact that
if the parameter $\theta$ changes, then the image $x' = I_\theta x$ of a fixed
vector $x$ describes with its endpoint an ellipse passing through the
end of vector $x$. To distinct vectors $x, y, z$ correspond ellipses that
are similar and similarly situated (see Fig. 44; all this is easy to
grasp if we revert to Fig. 43). Fig. 44 depicts two hatched figures,
one of which is carried into the other by an elliptic rotation. In
the geometry that we artificially introduced in the plane, these two
figures are to be regarded as identical (congruent), that is to say,
superimposable.
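
The elliptic rotation can be traced numerically. The sketch below is only an illustration under stated assumptions: in the natural (old) Cartesian coordinates we take $e_1 = (A, 0)$ with an assumed length $A = 2$ and $e_2 = (0, 1)$, which matches the description of Fig. 42 but not necessarily its exact proportions; all function names are hypothetical.

```python
import math

A = 2.0  # assumed Euclidean length of e1 (any value > 1); e2 has length 1

def to_old_coords(comp):
    """Old Cartesian coordinates of the vector x1*e1 + x2*e2."""
    x1, x2 = comp
    return (A * x1, x2)

def rotate_components(comp, theta):
    """Formulas (6) applied to the components relative to e1, e2."""
    x1, x2 = comp
    return (x1 * math.cos(theta) + x2 * math.sin(theta),
            -x1 * math.sin(theta) + x2 * math.cos(theta))

def ellipse_level(point):
    """Value of (X/A)^2 + Y^2 at a point given in old coordinates."""
    X, Y = point
    return (X / A) ** 2 + Y ** 2

x = (0.6, 0.8)   # a unit vector in the new metric
orbit = [rotate_components(x, 2 * math.pi * j / 12) for j in range(12)]

# In old coordinates every image lies on the same ellipse (X/A)^2 + Y^2 = 1 ...
levels = [ellipse_level(to_old_coords(c)) for c in orbit]
# ... while the old Euclidean length of the image varies along the orbit.
old_lengths = [math.hypot(*to_old_coords(c)) for c in orbit]
```

In the new metric the orbit is a circle of radius one; in the old metric it is the ellipse with semi-axes $A$ and $1$, which is what the term "elliptic rotation" refers to.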

10. The basis $e_1, e_2$ could have been taken quite arbitrarily and,
assuming that $x = x^1 e_1 + x^2 e_2$, $y = y^1 e_1 + y^2 e_2$, we could introduce
the scalar product by the formula

$$(x, y) = g(x, y) = \sum_{i, j = 1}^{2} g_{ij} x^i y^j \qquad (8)$$

where the bilinear form $g(x, y)$ is chosen at pleasure, so long as
the quadratic form $g(x, x)$ is positive definite. From Section 5 it
follows that we will obtain a two-dimensional space metrically
isomorphic to the Euclidean plane. Taking advantage of the positive
definiteness of $g(x, x)$, it is easy to prove that circles, that is, the
curves

$$\|x\|^2 = g(x, x) = \text{constant}$$

on the plane with the scalar product (8) are, from the elementary
point of view, ellipses.

We thus see that on one and the same plane it is possible to 
specify an infinite number of distinct Euclidean metrics. To picture 
this more vividly, notice that any ellipse with centre at the zero 
point is a unit circle in some one (quite definite) Euclidean metric. 
Hence, there are as many Euclidean metrics on the plane as there 
are distinct ellipses with common centre. 

11. Quite naturally, in linear spaces of greater dimensionality
it is also possible to introduce different metrics that are isomorphic
to one another. Thus, for instance, in the space of functions
continuous on the interval $[-1, 1]$ we can introduce the scalar product

$$(x, y) = \int_{-1}^{1} \varphi(t)\, x(t)\, y(t)\, dt \qquad (9)$$

where $\varphi(t)$ is an arbitrarily chosen positive continuous function.
Then in place of formula (13), Section 4, we have

$$\|x(t)\|^2 = \int_{-1}^{1} \varphi(t)\, x^2(t)\, dt$$

The resulting space is metrically isomorphic to the space of
continuous functions with scalar product (12), Section 4, specified on
$[-1, 1]$. The metric isomorphism between them is, for instance, the
map carrying $x(t)$ into $x(t) \sqrt{\varphi(t)}$.

Note in passing that the pairs of functions for which the scalar
product (9) vanishes are called orthogonal with weight $\varphi(t)$ on the
interval $[-1, 1]$.
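
A numerical sketch may help to see why multiplication by $\sqrt{\varphi(t)}$ is a metric isomorphism. It assumes that the scalar product (12) of Section 4 is the unweighted integral of $x(t) y(t)$ over $[-1, 1]$; the weight $\varphi(t) = 1 + t^2$, the sample functions and the midpoint rule are likewise assumptions made only for this illustration.

```python
import math

def integrate(f, a=-1.0, b=1.0, n=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def phi(t):
    """An assumed positive continuous weight on [-1, 1]."""
    return 1.0 + t * t

def weighted_product(x, y):
    """Scalar product (9) with the weight phi."""
    return integrate(lambda t: phi(t) * x(t) * y(t))

def plain_product(x, y):
    """Unweighted product: the integral of x(t) y(t) over [-1, 1]."""
    return integrate(lambda t: x(t) * y(t))

def image(x):
    """The map carrying x(t) into x(t) * sqrt(phi(t))."""
    return lambda t: x(t) * math.sqrt(phi(t))

x = lambda t: 1.0 + t
y = lambda t: t * t

lhs = weighted_product(x, y)               # product in the weighted space
rhs = plain_product(image(x), image(y))    # product of the images

# The functions t and 1 are orthogonal with weight phi (odd integrand).
odd_even = weighted_product(lambda t: t, lambda t: 1.0)
```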

§ 8. The group of hyperbolic rotations 

1. We now turn to the (two-dimensional) geometry of Minkowski.
We will carry out all constructions in the ordinary Euclidean
plane. In it we take an orthonormal basis $e_1, e_2$ and introduce
the Minkowski metric with the aid of the quadratic form

$$\|x\|^2 = (x^1)^2 - (x^2)^2 \qquad (1)$$

relative to the basis $e_1, e_2$. Accordingly we have a formula for the
scalar product

$$(x, y) = x^1 y^1 - x^2 y^2 \qquad (2)$$

In this metric, $\|e_1\|^2 = 1$, $\|e_2\|^2 = -1$, $(e_1, e_2) = 0$. Thus, the
basis $e_1, e_2$ is also orthonormal in the metric (1); here, $e_1$ is a unit
vector and $e_2$ is an imaginary-unit vector.

In order to get a feeling of the peculiarity of the metric (1), it
is best to begin with a consideration of the unit circle. This is the
term used to describe the locus of endpoints of all possible vectors
whose norms are equal to unity in absolute value (we assume that
all vectors issue from the zero point). In the given basis, the unit
circle is defined by the equation $|(x^1)^2 - (x^2)^2| = 1$, whence either
$(x^1)^2 - (x^2)^2 = 1$ or $(x^1)^2 - (x^2)^2 = -1$. In Euclidean geometry,
these two equations define, relative to the basis $e_1, e_2$, conjugate
equilateral hyperbolas whose common asymptotes are the bisectors
of the quadrantal angles. Thus, in the Minkowski metric, the unit
circle consists of two Euclidean hyperbolas, on one of which lie the
endpoints of the unit vectors and on the other the endpoints of the
imaginary-unit vectors (Fig. 45). In contrast to this terminology,
some use the term unit circle to mean only the first of the
hyperbolas, the other being called the imaginary-unit circle.

Consider an arbitrary vector $x = x^1 e_1 + x^2 e_2$ extending along
one of the asymptotes of these hyperbolas; in this case $|x^1| = |x^2|$
and therefore $\|x\|^2 = 0$. Thus, on the asymptotes lie isotropic
vectors, that is, vectors with zero norm.

2. Important remark. The triangle inequality does not hold true
in the Minkowski plane. This is immediately evident in the case of
triangle $OAB$ (Fig. 46), where the vectors $OA$ and $AB$ are isotropic
(parallel to the asymptotes of the hyperbolas). If we denote
$OA = x$, $AB = y$, then

$$\|x + y\| > \|x\| + \|y\| = 0$$

or, in terms of distances,

$$\rho(O, B) > \rho(O, A) + \rho(A, B) = 0$$

It can be demonstrated that in any quasi-Euclidean space there
are three points for which the triangle axiom does not hold true.
We leave the proof to the reader.
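
A concrete instance of the failure of the triangle inequality can be computed directly. The vectors below are an assumed example (the actual proportions of Fig. 46 are not reproduced here), and the function names are hypothetical.

```python
import math

def norm_sq(v):
    """Quadratic form (1): (x1)^2 - (x2)^2."""
    return v[0] ** 2 - v[1] ** 2

def norm(v):
    """Norm of a vector with nonnegative norm_sq (the only case needed here)."""
    return math.sqrt(norm_sq(v))

x = (1.0, 1.0)     # OA, directed along the asymptote x2 = x1
y = (1.0, -1.0)    # AB, directed along the asymptote x2 = -x1
s = (x[0] + y[0], x[1] + y[1])   # OB = OA + AB = (2, 0)

lhs = norm(s)                # norm of the sum
rhs = norm(x) + norm(y)      # sum of the norms of two isotropic vectors
```

Here the sum of two vectors of zero norm has norm 2, so the "side" $OB$ is longer than the two other sides taken together.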

3. Let $x, y$ be two nonisotropic vectors. Suppose they are
perpendicular to one another in the sense of Minkowski; we shall
attempt to describe what their perpendicularity means from the
Euclidean standpoint. From (2) we have $(x, y) = x^1 y^1 - x^2 y^2 = 0$; then
$((x^1)^2 - (x^2)^2) \cdot ((y^1)^2 - (y^2)^2) = -(x^1 y^2 - x^2 y^1)^2$, whence it follows
that $\|x\|^2 = (x^1)^2 - (x^2)^2$ and $\|y\|^2 = (y^1)^2 - (y^2)^2$ are numbers
of distinct signs. Hence, if one of the vectors $x, y$ has a real norm
in the Minkowski metric, then the other vector has an imaginary
norm; in the Euclidean sense, this means that the vectors $x, y$ or
their extensions intersect different hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$.
Since we are now interested only in the directions of the vectors
$x, y$, we can, without any loss of generality, assume that their
endpoints lie on the unit circle of the Minkowski metric. For example,
let $(x^1)^2 - (x^2)^2 = 1$, $(y^1)^2 - (y^2)^2 = -1$. Then if
$(x, y) = x^1 y^1 - x^2 y^2 = 0$, it will follow that
$(y^1 - x^1)^2 - (y^2 - x^2)^2 = 0$;
conversely, from the latter relation it follows that $(x, y) = 0$. But
the equation $(y^1 - x^1)^2 - (y^2 - x^2)^2 = 0$ means that the difference
$y - x$ of the vectors $x, y$ is an isotropic vector, which means that
it is directed along some quadrantal bisector. But this is equivalent
to the vectors $x, y$ being symmetric with respect to another bisector.

Fig. 45    Fig. 46

To summarize, then, the vectors $x, y$ are orthogonal to each other
in the sense of Minkowski if and only if in the Euclidean sense
they are located on rays that are symmetric relative to one of the
quadrantal bisectors (Fig. 47).

Observe that the isotropic vector $x$ (lying on the bisector) is
orthogonal to itself: $(x, x) = \|x\|^2 = 0$.
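
The Euclidean description of Minkowski orthogonality can be verified on a sample pair. In the sketch below the parametrization of the unit vector by hyperbolic functions and the value of the parameter are assumptions made for illustration, as are the function names.

```python
import math

def mink(x, y):
    """Scalar product (2): x1*y1 - x2*y2."""
    return x[0] * y[0] - x[1] * y[1]

def reflect_about_bisector(v):
    """Euclidean symmetry about the bisector x2 = x1."""
    return (v[1], v[0])

t = 0.7                                 # an assumed parameter value
x = (math.cosh(t), math.sinh(t))        # endpoint on (x1)^2 - (x2)^2 = 1
y = reflect_about_bisector(x)           # endpoint on (x1)^2 - (x2)^2 = -1
diff = (y[0] - x[0], y[1] - x[1])       # directed along the other bisector

orthogonal = abs(mink(x, y)) < 1e-12
diff_is_isotropic = abs(mink(diff, diff)) < 1e-12
```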

4. The material presented in Subsection 3 permits giving a
Euclidean description of all bases orthonormal in the Minkowski
metric under consideration. Namely, the arbitrary orthonormal basis
is composed of the vectors $e_{1'}, e_{2'}$, the endpoints of which lie
on the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$ symmetrically relative to
one of their (common) asymptotes (Fig. 48).


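5. Now consider $k$-orthogonal matrices ($n = 2$, $k = 1$) corresponding
to a two-dimensional Minkowski metric. Write down any
one of the matrices in the form

$$P = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

By the definition of $k$-orthogonality we have in this case

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

whence

$$\alpha^2 - \beta^2 = 1, \quad \alpha\gamma - \beta\delta = 0, \quad \gamma\alpha - \delta\beta = 0, \quad \gamma^2 - \delta^2 = -1 \qquad (3)$$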


We find the general solution of this system (three equations).
Because of the second equation of (3) we can write

$$\gamma = \lambda\beta, \quad \delta = \lambda\alpha \qquad (4)$$

where $\lambda$ is the new unknown. Substituting these expressions into
the last equation, we get

$$\gamma^2 - \delta^2 = \lambda^2 (\beta^2 - \alpha^2) = -\lambda^2 = -1$$

Thus, $\lambda = \pm 1$. On the other hand,

$$\det P = \begin{vmatrix} \alpha & \beta \\ \lambda\beta & \lambda\alpha \end{vmatrix} = \lambda (\alpha^2 - \beta^2) = \lambda$$

Hence, to the values $\lambda = \pm 1$ there correspond transformations
of the basis with orientation preserved or disrupted.

Now make use of the equation $\alpha^2 - \beta^2 = 1$. Its general solution
is of the form

$$\alpha = \pm\cosh\theta, \quad \beta = \pm\sinh\theta \qquad (5)$$

where $\theta$ is an arbitrary parameter. However we can assume that
if $\alpha > 0$, then

$$\alpha = \cosh\theta, \quad \beta = \sinh\theta, \quad -\infty < \theta < +\infty \qquad (5a)$$

and if $\alpha < 0$, then

$$\alpha = -\cosh\theta, \quad \beta = -\sinh\theta, \quad -\infty < \theta < +\infty \qquad (5b)$$

since the remaining cases of (5) reduce to (5a) or (5b) via a
replacement of $\theta$ by $-\theta$.

Noting that $\lambda = \pm 1$, we see that formulas (4), (5a) and (5b)
yield all the solutions of (3). We have thus found all the
$k$-orthogonal matrices for $n = 2$, $k = 1$.

By subjecting the original basis $e_1, e_2$ to a transformation with
an arbitrary $k$-orthogonal matrix, we obtain a new basis $e_{1'}, e_{2'}$:

$$e_{1'} = \alpha e_1 + \beta e_2, \qquad e_{2'} = \gamma e_1 + \delta e_2 \qquad (6)$$

whose vector $e_{1'}$ is a unit vector:

$$\|e_{1'}\|^2 = \alpha^2 - \beta^2 = +1$$

and the vector $e_{2'}$ is an imaginary-unit vector:

$$\|e_{2'}\|^2 = \gamma^2 - \delta^2 = -1$$

from which it is evident that the transformation (6) cannot carry
a vector whose endpoint lies on one of the hyperbolas
$(x^1)^2 - (x^2)^2 = \pm 1$ into a vector with endpoint on the other
hyperbola.
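
The four families of solutions given by (4), (5a) and (5b) can be generated and tested against system (3). This is a minimal sketch; the function names and the sample value of $\theta$ are assumptions introduced for the illustration.

```python
import math

def k_orthogonal(theta, alpha_sign=1, lam=1):
    """alpha, beta as in (5a) (alpha_sign = 1) or (5b) (alpha_sign = -1);
    gamma = lam*beta, delta = lam*alpha as in (4)."""
    alpha = alpha_sign * math.cosh(theta)
    beta = alpha_sign * math.sinh(theta)
    return [[alpha, beta], [lam * beta, lam * alpha]]

def satisfies_system_3(p, tol=1e-9):
    """Check the three distinct equations of system (3)."""
    (a, b), (g, d) = p
    return (abs(a * a - b * b - 1) < tol
            and abs(a * g - b * d) < tol
            and abs(g * g - d * d + 1) < tol)

def det(p):
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]

cases = [(sign, lam) for sign in (1, -1) for lam in (1, -1)]
all_ok = all(satisfies_system_3(k_orthogonal(1.3, sign, lam))
             for sign, lam in cases)
# The determinant equals lam, whatever the sign of alpha.
dets = {(sign, lam): det(k_orthogonal(1.3, sign, lam)) for sign, lam in cases}
```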

6. We now elucidate the Euclidean geometric meaning of the
conditions $\alpha > 0$ and $\alpha < 0$. Since the basis $e_1, e_2$ is orthonormal
in the Euclidean metric of the plane, it follows that, taking
Euclidean scalar products, we get

$$(e_{1'}, e_1) = \alpha (e_1, e_1) + \beta (e_2, e_1) = \alpha \qquad (7)$$

Thus, for $\alpha > 0$ the vectors $e_{1'}$ and $e_1$ constitute an acute angle,
for $\alpha < 0$, an obtuse angle. We then conclude that if $\alpha > 0$, then
the vector $e_{1'}$ has its endpoint on the same branch of the hyperbola
$(x^1)^2 - (x^2)^2 = 1$ as vector $e_1$ (the right branch in Fig. 48); if
$\alpha < 0$, then the endpoints of the vectors $e_1, e_{1'}$ lie on different
branches of this hyperbola.

7. As in (7) we can express $\beta$, $\gamma$, $\delta$ in terms of the Euclidean
scalar products of the vectors:

$$\beta = (e_{1'}, e_2), \quad \gamma = (e_{2'}, e_1), \quad \delta = (e_{2'}, e_2)$$

Figs. 49 and 50 show different arrangements of basis $e_{1'}, e_{2'}$
depending on the signs of $\lambda$, $\alpha$, and $\beta$.

8. Now consider the matrix $P$ for $\lambda = +1$, $\alpha > 0$. By the
foregoing,

$$P = \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix} \qquad (8)$$

In making the transition from an orthonormal (in the Minkowski
metric) basis $e_1, e_2$ to a new basis via matrix $P$ of the form (8) we
obtain an orthonormal basis $e_{1'}, e_{2'}$ with the same orientation as
$e_1, e_2$; besides, the terminal points of the vectors $e_{1'}, e_{2'}$ lie on the
same branches of the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$ as do the
terminal points of the corresponding vectors $e_1, e_2$. Matrices of the
type (8) play the same role in two-dimensional Minkowski geometry
as the matrices (4), Section 7, do in two-dimensional Euclidean
geometry.

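9. Matrices of type (8) constitute a subgroup of the whole
$k$-orthogonal group for $k = 1$, $n = 2$. Indeed, we have the equations

$$\begin{pmatrix} \cosh\theta_1 & \sinh\theta_1 \\ \sinh\theta_1 & \cosh\theta_1 \end{pmatrix} \begin{pmatrix} \cosh\theta_2 & \sinh\theta_2 \\ \sinh\theta_2 & \cosh\theta_2 \end{pmatrix} = \begin{pmatrix} \cosh(\theta_1 + \theta_2) & \sinh(\theta_1 + \theta_2) \\ \sinh(\theta_1 + \theta_2) & \cosh(\theta_1 + \theta_2) \end{pmatrix},$$

$$\begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix}^{-1} = \begin{pmatrix} \cosh(-\theta) & \sinh(-\theta) \\ \sinh(-\theta) & \cosh(-\theta) \end{pmatrix}$$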

These equations are readily established and they signify that the
product of matrices of type (8) and the inversion of a matrix of
type (8) lead to matrices of the same type.

10. To a transformation of the basis via matrix $P$ corresponds
a transformation of components with matrix $Q = (P^*)^{-1}$:

$$x^{1'} = x^1 \cosh\theta - x^2 \sinh\theta, \qquad x^{2'} = -x^1 \sinh\theta + x^2 \cosh\theta \qquad (9)$$

On the other hand, as in Subsection 5, Section 7, we can consider
that the basis $e_1, e_2$ does not change but that the formulas (9)
associate to an arbitrary vector $x = x^1 e_1 + x^2 e_2$ a new vector
$x' = x^{1'} e_1 + x^{2'} e_2$. In this sense, formulas (9) constitute a
component representation relative to the basis $e_1, e_2$ of some linear
transformation of the plane. We denote it by $H_\theta$. Relative to the
Minkowski metric, this is an isometric transformation similar to the
rotation $x' = I_\theta x$ of the plane with a Euclidean metric. For this
reason, the transformation $x' = H_\theta x$ is called a hyperbolic rotation
of the plane. The term "hyperbolic" is due to the fact that for a
fixed $x$ and varying $\theta$ the terminal point of the vector $x' = H_\theta x$
slides along the hyperbola $(x^1)^2 - (x^2)^2 = \|x\|^2$ (Fig. 51).

Fig. 51

Let an arbitrary geometrical figure $W$ be given in the plane. The
hyperbolic rotation $H_\theta$ carries it into a new figure $W'$. It is taken,
by definition, that the figures $W$ and $W'$ are congruent in the
Minkowski metric. In the Euclidean metric they are, generally
speaking, not congruent. An elementary example is shown in Fig. 52,
where the figure and its image are designated by $W$ and $W'$ and
are shown hatched.
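
The two basic properties of the hyperbolic rotation, preservation of the form (1) and the addition of parameters under composition, can be checked on a sample vector. The vector and the parameter values below are assumptions chosen for illustration; the function names are hypothetical.

```python
import math

def hyperbolic_rotation(x, theta):
    """Formulas (9): components of x' = H_theta x."""
    x1, x2 = x
    return (x1 * math.cosh(theta) - x2 * math.sinh(theta),
            -x1 * math.sinh(theta) + x2 * math.cosh(theta))

def norm_sq(x):
    """Quadratic form (1): (x1)^2 - (x2)^2."""
    return x[0] ** 2 - x[1] ** 2

x = (3.0, 1.0)                           # an assumed sample vector
once = hyperbolic_rotation(x, 0.5)
twice = hyperbolic_rotation(once, 0.8)   # H_0.8 applied after H_0.5
direct = hyperbolic_rotation(x, 1.3)     # H_1.3 applied once

stays_on_hyperbola = abs(norm_sq(once) - norm_sq(x)) < 1e-9
parameters_add = all(abs(a - b) < 1e-9 for a, b in zip(twice, direct))
```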

11. We will now see that the analogy between a hyperbolic and
an ordinary (Euclidean) rotation is very far-reaching. With this
purpose in mind, we now decipher the geometric significance of the
parameter $\theta$ in the hyperbolic rotation $x' = H_\theta x$. For the sake of
simplicity, suppose that $x = \{x^1, x^2\}$ is a unit vector, that is, its
terminus lies on the hyperbola $(x^1)^2 - (x^2)^2 = 1$. Let $S(\theta)$ be the
area of the curvilinear triangle bounded by the vectors $x$, $x' = H_\theta x$
and by the arc of the hyperbola between their termini (Fig. 53).
We will say that $S > 0$ if the rotation from $x$ to $x'$ is counterclockwise,
and $S < 0$ otherwise. Let $x'' = H_{\Delta\theta} x'$, let $\Delta S$ be the increment
in area $S(\theta)$ when passing from $x'$ to $x''$, and $\Delta\sigma$ the oriented area
of the parallelogram constructed on the vectors $x'$ and $x''$ (Fig. 53).

Fig. 52    Fig. 53

Then

$$\Delta\sigma = \begin{vmatrix} x^{1'} & x^{2'} \\ x^{1''} & x^{2''} \end{vmatrix} = -(x^{1'})^2 \sinh\Delta\theta + (x^{2'})^2 \sinh\Delta\theta = -\sinh\Delta\theta$$

Here we make use of formulas (9) with $\theta$ replaced by $\Delta\theta$ and the
equation of the hyperbola $(x^{1'})^2 - (x^{2'})^2 = 1$. On the other hand,

$$\Delta S \approx \frac{1}{2} \Delta\sigma = -\frac{1}{2} \sinh\Delta\theta \approx -\frac{1}{2} \Delta\theta$$

where the approximate equations hold up to quantities of higher
order relative to $\Delta\theta$, whence

$$dS = -\frac{1}{2}\, d\theta$$

Thus, if we take into account that $S = 0$ for $\theta = 0$, then we get
the equation

$$\theta = -2S$$

Note that when an ordinary (Euclidean) rotation of the plane takes
place through the angle $\theta$, then the terminus of every unit vector
slides along the unit circle and the angle $\theta$ is equal in absolute
value to twice the area swept out by the turning vector. As can be
seen from the calculations just made, a similar situation arises in
a hyperbolic rotation. The terminus of the unit (or imaginary-unit)
vector slides along one of the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$ and
sweeps out an area equal in absolute value to one half the
"hyperbolic angle".
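
The relation between the parameter of a hyperbolic rotation and the swept area can be confirmed by summing thin triangles. The sketch below is only a numerical illustration: the starting unit vector, the value of $\theta$ and the number of steps are assumptions, and the sector is approximated by a polygon.

```python
import math

def hyperbolic_rotation(x, theta):
    """Formulas (9): components of x' = H_theta x."""
    x1, x2 = x
    return (x1 * math.cosh(theta) - x2 * math.sinh(theta),
            -x1 * math.sinh(theta) + x2 * math.cosh(theta))

def swept_area(x, theta, steps=10000):
    """Oriented area swept by the vector while the parameter runs from 0
    to theta, approximated by thin triangles with a vertex at the origin
    (each triangle is half of the parallelogram on two successive vectors)."""
    area = 0.0
    prev = x
    for j in range(1, steps + 1):
        cur = hyperbolic_rotation(x, theta * j / steps)
        area += 0.5 * (prev[0] * cur[1] - prev[1] * cur[0])
        prev = cur
    return area

x = (math.cosh(0.2), math.sinh(0.2))   # an assumed unit vector
theta = 1.5                            # an assumed parameter value
S = swept_area(x, theta)               # expected to be close to -theta / 2
```

With these sample values the computed area is negative, in agreement with the sign convention of the text: for $\theta > 0$ the terminus moves clockwise along the right branch of the hyperbola.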

12. It is immediately apparent from (9) that the hyperbolic
rotation has eigenvectors directed along the (common) asymptotes of
the hyperbolas $(x^1)^2 - (x^2)^2 = \pm 1$. Indeed, if the vector
$x = \{x^1, x^2\}$ lies on the first asymptote, it follows that $x^1 = x^2$ and
then $x^{1'} = x^{2'}$. Thus, for the first asymptote, $x' = H_\theta x = \lambda_1 x$,
where, as is readily seen,

$$\lambda_1 = \cosh\theta - \sinh\theta$$

Similarly for the second asymptote $x = \{x^1, -x^1\}$ and
$x' = H_\theta x = \lambda_2 x$, where

$$\lambda_2 = \cosh\theta + \sinh\theta$$

An essential point is that

$$\lambda_1 \lambda_2 = 1 \qquad (10)$$

Fig. 54

For $\theta > 0$ we have $\lambda_1 < 1$, $\lambda_2 > 1$; in this case the plane is
compressed toward the straight line $x^2 = -x^1$ by the factor $\lambda_1$ and,
due to (10), is stretched by the reciprocal factor $\lambda_2 = 1/\lambda_1$ in the
orthogonal direction, away from the straight line $x^2 = x^1$, as shown
in Fig. 54 for a number of points. Here, all the points of the plane
not lying on the invariant straight lines $x^2 = \pm x^1$ slide along the
hyperbolas $(x^1)^2 - (x^2)^2 = \text{constant}$, which is clearly evident
(because of (10)) irrespective of the arguments of the preceding
subsections.

If $\theta < 0$, then the directions of stretching and compression
change places as compared with the case $\theta > 0$, and the direction
of motion of the points along the hyperbolas is reversed.
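
The eigenvectors and eigenvalues of a hyperbolic rotation are easy to confirm numerically. The sketch below uses an assumed value of $\theta$ and hypothetical names; it applies formulas (9) to one vector on each asymptote.

```python
import math

def hyperbolic_rotation(x, theta):
    """Formulas (9): components of x' = H_theta x."""
    x1, x2 = x
    return (x1 * math.cosh(theta) - x2 * math.sinh(theta),
            -x1 * math.sinh(theta) + x2 * math.cosh(theta))

theta = 0.9                                    # an assumed positive value
lam1 = math.cosh(theta) - math.sinh(theta)     # eigenvalue on x2 = x1
lam2 = math.cosh(theta) + math.sinh(theta)     # eigenvalue on x2 = -x1

on_first = hyperbolic_rotation((1.0, 1.0), theta)     # expected lam1 * (1, 1)
on_second = hyperbolic_rotation((1.0, -1.0), theta)   # expected lam2 * (1, -1)
```

Since $\cosh\theta \mp \sinh\theta = e^{\mp\theta}$, relation (10) is simply $e^{-\theta} e^{\theta} = 1$.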

13. Note in conclusion that on one and the same Euclidean plane
it is possible to specify an infinite number of distinct Minkowski
metrics. To each there corresponds its own pair of conjugate
hyperbolas for the unit circle; conversely, any pair of conjugate
hyperbolas serves as a unit circle for some (quite definite) Minkowski
metric.

If the hyperbolas constituting the unit circle of a Minkowski
metric are not equilateral, the aforementioned Euclidean
characteristic of orthogonality of two vectors in the sense of Minkowski
(their symmetry relative to one of the asymptotes) is no longer
valid. More general (and true in all cases) is the following
contention: two vectors are orthogonal in a given Minkowski metric
if and only if they extend along the two conjugate diameters of
the hyperbolas making up the unit circle in that metric. We omit
the proof of this assertion.

By giving different Minkowski metrics to the plane, we obtain
distinct quadratic-metric spaces. They are of course all metrically
isomorphic to one another. From the algebraic standpoint, their
geometries are identical since their subject matter consists of the
invariants of one and the same $k$-orthogonal group.

§ 9. Tensor algebra in quadratic-metric spaces 

1. We again consider quadratic-metric spaces of arbitrary
dimension and give a description of tensor algebra in such spaces.
To be more exact, we will indicate special propositions of tensor
algebra connected with the presence of a metric. Note that the
tensorial apparatus proves to be useful in many problems of the theory
of quadratic-metric spaces, particularly in cases where the
circumstances compel the use of arbitrary (nonorthonormal) bases.

2. Let $R_n$ be an $n$-dimensional linear space with a specified
metric form

$$\|x\|^2 = g(x, x) = \sum g_{ik} x^i x^k \qquad (1)$$

where $x = \sum x^i e_i$ in an arbitrary basis $e_1, \ldots, e_n$.

Accordingly, for the scalar product we have

$$(x, y) = g(x, y) = \sum g_{ik} x^i y^k \qquad (2)$$

The tensor of the quadratic form (1) or, what is the same, the
bilinear form (2), is called the metric tensor of the space $R_n$. This is
a second-order covariant symmetric tensor whose components
relative to the basis $e_1, \ldots, e_n$ are given by the multiplication table
of the basis vectors:

$$(e_i, e_k) = g_{ik}$$

3. By $G = \|g_{ik}\|$ we denote the matrix of the metric tensor in
the basis $e_1, \ldots, e_n$, that is, the matrix of the form (2). Because
of the nonsingularity of the form $g(x, y)$ we have: $\det G \neq 0$.
Thus, there exists an inverse matrix $G^{-1}$. The elements of $G^{-1}$ are
indicated by the standard procedure, $g$ with superscripts:

$$G^{-1} = \|g^{ik}\|$$

By the definition of an inverse matrix we have

$$\sum_s g^{is} g_{sk} = \delta^i_k \qquad (3)$$

4. Theorem 1. The quantities $g^{ik}$ are the components of a
second-order contravariant tensor.

Remark. The assertion of the theorem means that when passing
to a new basis we have the transformation law

$$g^{i'k'} = \sum g^{ik} Q^{i'}_i Q^{k'}_k \qquad (4)$$

where $g^{i'k'}$ are elements of a matrix inverse to the matrix $\|g_{i'k'}\|$ of the
covariant metric tensor relative to the new basis. Equation (4)
should be derived as a consequence of the transformation law

$$g_{i'k'} = \sum g_{ik} P^i_{i'} P^k_{k'} \qquad (4a)$$

which we know together with the definition of the metric tensor.
However, it is technically difficult to derive (4) from (4a), and so
the following proof of the theorem is based on an earlier described
characteristic of tensor quantities (see Chapter V, Section 4).

Proof. Consider the space $R_n^*$ which is conjugate to the space
$R_n$, and in it a basis $e^1, \ldots, e^n$, which is reciprocal to
the basis $e_1, \ldots, e_n \in R_n$.

Construct the transformation $u = G(x)$, which to each vector
$x = \sum x^k e_k \in R_n$ associates a vector
$u = \sum u_i e^i \in R_n^*$ via the formula

$$u_i = \sum g_{ik} x^k \qquad (5)$$

Since the orders are in agreement here (in the right member we have
a contraction of a second-order covariant tensor with a first-order
vector), the transformation $u = G(x)$ is specified invariantly.

On the other hand, due to the fact that $\det G \neq 0$, the image
of the space $R_n$ is the entire space $R_n^*$. This means that for
every vector $u \in R_n^*$ there is a vector $x \in R_n$ such that
$u = G(x)$. This vector $x$ is uniquely defined by

$$x^i = \sum g^{ik} u_k \qquad (6)$$

To obtain (6) it suffices to solve the system of equations (5),
regarding the $x^k$ as unknown numbers and the $u_i$ as known.

Formula (6) shows that when $g^{ik}$ is contracted with an arbitrary
covariant vector $u_k$, the result is a first-order contravariant
tensor. By a familiar characteristic of tensorial quantities we
conclude that $g^{ik}$ is a tensor whose total order corresponds to
the arrangement of the indices. The proof of the theorem is complete.

5. The tensor $g^{ik}$ is termed the contravariant metric tensor.

6. It is convenient in quadratic-metric spaces to make use of what
is called the reciprocal basis of the given basis.

Definition. The bases $e_1, \ldots, e_n$ and $e^1, \ldots, e^n$ in
$R_n$ are called reciprocal bases if $(e^i, e_j) = \delta^i_j$.

Remark. In contrast to Chapter V, here the given basis and the
reciprocal basis are taken in the same space. Later on it will be
shown that the concept of reciprocal bases in a single
quadratic-metric space actually reduces to that of reciprocal bases
lying in the given space and in the conjugate space.

Fig. 55 depicts reciprocal bases $e_1, e_2$ and $e^1, e^2$ in a
plane with the ordinary Euclidean metric. By the definition of
reciprocal bases, we have four conditions in this case:

$$(e^1, e_1) = 1, \quad (e^1, e_2) = 0,$$
$$(e^2, e_1) = 0, \quad (e^2, e_2) = 1$$

From them it follows that the vector $e^1$ is perpendicular to the
vector $e_2$, and the vector $e^2$ is perpendicular to the vector
$e_1$; besides, since $(e^1, e_1) > 0$ and $(e^2, e_2) > 0$, it
follows that $e^1, e_1$ and also $e^2, e_2$ form acute angles. To
take precise account of the conditions $(e^1, e_1) = 1$ and
$(e^2, e_2) = 1$ in the figure requires specifying a scale unit
(these conditions determine the lengths of the vectors $e^1, e^2$
from the given vectors $e_1, e_2$). In the two-dimensional Euclidean
case it is clear geometrically that the given basis uniquely
determines the reciprocal basis. At the same time the following
general theorem holds true.

Theorem 2. For an arbitrary given basis $e_1, \ldots, e_n$ in $R_n$
the reciprocal basis $e^1, \ldots, e^n$ is always determined
uniquely.

Proof. The vectors of the desired reciprocal basis can always be
represented as

$$e^k = \sum A^{k\alpha} e_\alpha \qquad (7)$$

where $A^{k\alpha}$ are unknown numerical coefficients.

Form the scalar product of $e_i$ into the right and left members of
(7). Noting that $(e^k, e_i) = \delta^k_i$ and
$(e_\alpha, e_i) = g_{\alpha i}$, we get

$$\sum A^{k\alpha} g_{\alpha i} = \delta^k_i \qquad (8)$$

The product of an unknown matrix $A = \|A^{k\alpha}\|$ by a known
nonsingular matrix $G$ yields the unit matrix $E = \|\delta^k_i\|$,
whence $A = G^{-1}$, that is, $A^{k\alpha} = g^{k\alpha}$. We thus
get the only possible equations

$$e^k = \sum g^{k\alpha} e_\alpha \qquad (9)$$

Since $\det \|g^{k\alpha}\| \neq 0$, the vectors $e^1, \ldots, e^n$
defined by (9) are linearly independent and, hence, constitute a
basis. It remains to see, via direct verification, that this basis
is indeed reciprocal to the given basis. We have

$$(e^k, e_i) = \sum g^{k\alpha} (e_\alpha, e_i) = \sum g^{k\alpha} g_{\alpha i} = \delta^k_i$$

which is what is required, and this completes the proof.
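Formula (9) lends itself to a direct numerical check. The following
is a minimal Python sketch, assuming the ordinary Euclidean plane
with Cartesian coordinates; the skew basis `e` and all helper names
are illustrative choices that do not come from the text.

```python
from fractions import Fraction

def dot(u, v):
    """Ordinary Euclidean scalar product in Cartesian coordinates."""
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(m):
    """Inverse of a nonsingular 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def reciprocal_basis(e):
    """Return (G, G_inv, e_up) for a basis e = [e_1, e_2] of the plane."""
    # g_ik = (e_i, e_k): multiplication table of the basis vectors
    G = [[dot(e[i], e[k]) for k in range(2)] for i in range(2)]
    G_inv = inverse_2x2(G)  # the quantities g^{ik}
    # formula (9): e^k = sum over alpha of g^{k alpha} e_alpha
    e_up = [[sum(G_inv[k][a] * e[a][c] for a in range(2)) for c in range(2)]
            for k in range(2)]
    return G, G_inv, e_up

e = [[1, 0], [1, 1]]  # hypothetical skew basis e_1, e_2
G, G_inv, e_up = reciprocal_basis(e)
```

Here `e_up` holds the Cartesian coordinates of $e^1, e^2$. The
scalar products of `e_up` with `e` reproduce the Kronecker delta,
and the scalar products of the vectors of `e_up` with one another
reproduce the matrix `G_inv`.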

7. Inverting (9) we get

$$e_k = \sum g_{k\alpha} e^\alpha \qquad (10)$$

8. Forming the scalar product of $e^i$ into (9), we get

$$(e^i, e^k) = \sum g^{k\alpha} (e^i, e_\alpha) = \sum g^{k\alpha} \delta^i_\alpha$$

whence we obtain a multiplication table for the vectors of the
reciprocal basis:

$$(e^i, e^k) = g^{ik} \qquad (11)$$

The right members of this table yield the components of the
contravariant metric tensor.

9. Let $x, y$ be arbitrary vectors in $R_n$. Expand them in terms of
the reciprocal basis; the components with respect to this basis will
(here and henceforth) be indicated by lower indices. Forming the
scalar product of $x$ by $y$ and using formulas (11), we get

$$(x, y) = \sum g^{ik} x_i y_k \qquad (12)$$

At the same time we have

$$\|x\|^2 = \sum g^{ik} x_i x_k \qquad (13)$$

Formulas (12) and (13) are reciprocals of (2) and (1).


10. Given two vectors $x, u \in R_n$. Expand $x$ in terms of the
basis $e_1, \ldots, e_n$:

$$x = \sum x^i e_i$$

Expand $u$ in terms of the reciprocal basis $e^1, \ldots, e^n$:

$$u = \sum u_k e^k$$

Form the scalar product

$$(u, x) = \sum u_k x^i (e^k, e_i) = \sum u_k x^i \delta^k_i = \sum u_k x^k = u_1 x^1 + \ldots + u_n x^n$$

We see that when the vector $x$ is expressed in terms of components
relative to the given basis and the vector $u$ in terms of
components relative to the reciprocal basis, then the scalar product
$(u, x)$ is expressed as a contraction. The vectors $u$ and $x$ can
naturally be interchanged, in which case we have

$$(u, x) = u^1 x_1 + \ldots + u^n x_n$$

11. In the space $R_n$, to each linear form $u(x)$ there uniquely
corresponds a vector $u \in R_n$ such that $u(x) = (u, x)$. In other
words, every linear form in $R_n$ is uniquely representable as a
scalar product. Indeed, relative to the basis $e_1, \ldots, e_n$ we
have

$$u(x) = u_1 x^1 + \ldots + u_n x^n$$

where $u_1, \ldots, u_n$ are quite definite coefficients of the form
$u(x)$, whence $u(x) = (u, x)$, where $u = \sum u_k e^k$
($e^1, \ldots, e^n$ is the reciprocal basis).

12. The linear forms of the space $R_n$ are elements of the
conjugate space $R_n^*$. By the preceding subsection, every form
$u(x) \in R_n^*$ is associated with a vector $u \in R_n$ such that
$u(x) = (u, x)$.

Clearly, this is a one-to-one correspondence between $R_n^*$ and
$R_n$. It is also easy to see that it is a linear isomorphism as
well. Indeed, if $u(x) = (u, x)$ and $v(x) = (v, x)$, then
$u(x) + v(x) = (u + v, x)$ and $\alpha u(x) = (\alpha u, x)$, where
$\alpha$ is any scalar.

13. Now we need not distinguish between $R_n^*$ and $R_n$ if the
elements of $R_n^*$ are replaced by their images in $R_n$ under the
isomorphism just indicated. Then every vector of $R_n$ is also a
vector of $R_n^*$. Accordingly, we say that a quadratic-metric space
is a self-conjugate space ($R_n = R_n^*$).

If $x$ and $u$ are vectors of $R_n$ but one is regarded as a vector
of $R_n$ and the other is viewed as a vector of $R_n^*$, then their
contraction in the sense of Section 2, Chapter V, coincides with the
scalar product $(u, x)$; in this case we must assume that
$e_1, \ldots, e_n \in R_n$ and $e^1, \ldots, e^n \in R_n^*$.

14. Suppose a given basis $e_1, \ldots, e_n$ transforms via matrix
$P$ by the formulas

$$e_{i'} = \sum P^i_{i'} e_i \qquad (14)$$

If we introduce the basis $e^{1'}, \ldots, e^{n'}$, which is
reciprocal to the new basis $e_{1'}, \ldots, e_{n'}$, then

$$e^{i'} = \sum Q^{i'}_i e^i \qquad (15)$$

where, as usual, $Q = \|Q^{i'}_i\| = (P^*)^{-1}$. There is no need to
prove (15) because, by Subsection 13, the defining of reciprocal
bases in $R_n$ reduces to defining the reciprocal bases in $R_n$ and
$R_n^*$. Therefore, in order to establish formulas (15) it suffices
to refer to the results of Section 1, Chapter V.

15. Let $x$ be an arbitrary vector in $R_n$. We can expand it either
in terms of the basis $e_1, \ldots, e_n$ or the reciprocal basis
$e^1, \ldots, e^n$:

$$x = \sum x^i e_i = \sum x_k e^k \qquad (16)$$

From (14) and (15) it follows that in passing to a new basis the
components $x^i$ transform by the contravariant law and the
components $x_k$ transform by the covariant law:

$$x^{i'} = \sum Q^{i'}_i x^i, \qquad x_{k'} = \sum P^k_{k'} x_k$$

For this reason, $x^i$ and $x_k$ are respectively termed the
contravariant and covariant components of the vector $x$. From (9),
(10) and (16) follow the formulas

$$x_k = \sum g_{ki} x^i, \qquad (17)$$

$$x^i = \sum g^{ik} x_k \qquad (18)$$

which express (relative to the given basis) the covariant components
of a vector in terms of its contravariant components, and also the
contravariant components in terms of the covariant components. As
for the vector $x$ itself, we can regard it with equal justification
as contravariant or covariant (since $R_n^* = R_n$).

From what has been said in this subsection it follows that every
first-order tensor (either contravariant $x^i$ or covariant $x_k$)
may be invariantly represented as a vector in $R_n$:
$x = \sum x^i e_i$ or $x = \sum x_k e^k$. The first-order tensors
$x^i$ and $x_k$ represent one and the same vector in $R_n$ if and
only if they are related by the condition (17) or (18) (which one is
immaterial since (18) follows from (17) and conversely).
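Formulas (17) and (18) can be tried out on a small example. The
sketch below is only an illustration: the metric matrix `G`, its
inverse `G_INV` and the sample vector are assumed values chosen for
convenience, and the function names are hypothetical.

```python
# Illustrative 2x2 metric: g_ik and its inverse g^{ik} (G * G_INV = E).
G = [[2, 1], [1, 1]]
G_INV = [[1, -1], [-1, 2]]

def lower(x_up):
    """Formula (17): x_k = sum over i of g_ki x^i."""
    return [sum(G[k][i] * x_up[i] for i in range(2)) for k in range(2)]

def raise_index(x_low):
    """Formula (18): x^i = sum over k of g^{ik} x_k."""
    return [sum(G_INV[i][k] * x_low[k] for k in range(2)) for i in range(2)]

def norm_squared(x_up):
    """Metric form (1): sum of g_ik x^i x^k."""
    return sum(G[i][k] * x_up[i] * x_up[k] for i in range(2) for k in range(2))

x_up = [3, -2]                       # contravariant components (sample values)
x_low = lower(x_up)                  # covariant components of the same vector
contraction = sum(a * b for a, b in zip(x_low, x_up))
```

Lowering and then raising returns the original components, and the
contraction of the covariant with the contravariant components
equals the value of the metric form on the vector.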

16. It is easy to see that in $R_n$ multiple-order tensors can also
be specified at pleasure by covariant, contravariant or mixed
components. For instance, let

$$a = \sum a^{ik} e_i e_k$$

be a contravariant tensor of order two, that is, an element of the
tensor product $R_n \otimes R_n$. Replacing $e_k$ in accord with
(10), we find

$$a = \sum a^{ik} e_i e_k = \sum a^{ik} e_i \Bigl(\sum g_{k\alpha} e^\alpha\Bigr) = \sum a^{ik} g_{k\alpha} e_i e^\alpha = \sum a^{i\alpha} g_{\alpha k} e_i e^k$$

Set

$$a^i_k = \sum a^{i\alpha} g_{\alpha k} \qquad (19)$$

Then the same tensor can be represented as

$$a = \sum a^i_k e_i e^k$$

which is an element of the tensor product $R_n \otimes R_n^*$.
Formulas (19) are expressed in words as follows: the second upper
index of tensor $a$ is lowered by means of the metric tensor. From
(3) and (19) follows

$$a^{ik} = \sum a^i_\alpha g^{\alpha k}$$

Here the lower index has been raised.

Some tensor calculations call for a good deal of such "juggling" of
indices (raising and lowering of indices). In such operations, it is
common to fix the original positions of indices by dots. For
example, in (19) $a^{i\cdot}_{\cdot k}$ is used instead of $a^i_k$
to emphasize that the second index is lowered and the first index
remains a superscript.
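In matrix language, lowering the second index by (19) amounts to
multiplying the matrix of components on the right by $G$, and
raising it again to multiplying by $G^{-1}$. A minimal Python
sketch, with an assumed sample metric and sample tensor components
(none of these numbers come from the text):

```python
# Illustrative 2x2 metric g_ik and its inverse g^{ik}.
G = [[2, 1], [1, 1]]
G_INV = [[1, -1], [-1, 2]]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][s] * b[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

def lower_second_index(a_up):
    """Formula (19): a^i_k = sum over alpha of a^{i alpha} g_{alpha k}."""
    return matmul(a_up, G)

def raise_second_index(a_mixed):
    """Inverse operation: a^{ik} = sum over alpha of a^i_alpha g^{alpha k}."""
    return matmul(a_mixed, G_INV)

a_up = [[1, 2], [3, 4]]              # sample contravariant components a^{ik}
a_mixed = lower_second_index(a_up)   # mixed components a^i_k
```

Raising the lowered index restores `a_up`, which is the content of
the remark that (3) and (19) together give back $a^{ik}$.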

17. We have already mentioned that $g_{ik}$ and $g^{ik}$ are
covariant and contravariant metric tensors respectively. However,
due to formulas (3) and (10) we have the tensor equation

$$\sum g_{ik} e^i e^k = \sum g^{ik} e_i e_k$$

And so it is best to say that there is one metric tensor and that
$g_{ik}$ and $g^{ik}$ are its covariant and contravariant components.


18. Now let it be given that $R_n$ has a positive index $k$
($0 \le k \le n$). Let $e_1, \ldots, e_n$ be an orthonormal basis in
$R_n$, provided that the first $k$ vectors are unit vectors (and the
others are imaginary-unit vectors). Then

$$G = \|g_{ik}\| = \operatorname{diag}(\underbrace{1, \ldots, 1}_{k}, \underbrace{-1, \ldots, -1}_{n-k}) = E_k \qquad (20)$$

(see Section 3, Subsection 1), whence

$$G^{-1} = \|g^{ik}\| = E_k = G \qquad (21)$$

Due to (21) the basis $e^1, \ldots, e^n$, which is the reciprocal of
the basis $e_1, \ldots, e_n$, is also orthonormal and the first $k$
vectors are unit vectors too. Besides, from (9) and (20), or from
(10) and (21), follow the relations

$$e^i = e_i, \quad i = 1, 2, \ldots, k;$$
$$e^i = -e_i, \quad i = k + 1, \ldots, n$$

Thus, the unit vectors of reciprocal orthonormal bases
correspondingly coincide and the imaginary-unit vectors differ in
sign. Besides that, by formulas (17) and (18) we have

$$x^i = x_i, \quad i = 1, 2, \ldots, k;$$
$$x^i = -x_i, \quad i = k + 1, \ldots, n$$

Similar equations apply for tensors of any order. We will write them
down in the particular case of a tensor of order three whose first
index is lowered:

$$a_i^{\cdot jl} = a^{ijl}, \quad i = 1, 2, \ldots, k;$$
$$a_i^{\cdot jl} = -a^{ijl}, \quad i = k + 1, \ldots, n$$

19. In the next section we will give some examples of the
application of tensor algebra in quadratic-metric spaces.

§ 10. The equation of a hyperplane in quadratic-metric space

1. Here we consider affine quadratic-metric space, that is, an
affine space $\mathfrak{A}$ that corresponds to a linear space $L$
with quadratic metric (see Section 2, Subsection 5). In
$\mathfrak{A}$ let us specify a system of affine coordinates with
arbitrary origin and basis $e_1, \ldots, e_n$. In this coordinate
system we specify the equation of a hyperplane:

$$A_1 x^1 + \ldots + A_n x^n + C = 0$$

or, briefly,

$$\sum A_k x^k + C = 0 \qquad (1)$$

In passing to a new basis (with origin preserved) we have

$$x^k = \sum P^k_{k'} x^{k'}$$

whence

$$\sum A_k x^k + C = \sum A_k P^k_{k'} x^{k'} + C = \sum A_{k'} x^{k'} + C \qquad (2)$$

where

$$A_{k'} = \sum A_k P^k_{k'} \qquad (3)$$

The last expression in the chain of equations (2) is the left-hand
member of the equation of the hyperplane in the new coordinate
system. From (2) and (3) it follows that there is invariantly
associated with the left member of the equation of the hyperplane a
vector

$$n = \{A_1, \ldots, A_n\}$$

with covariant components $A_1, \ldots, A_n$, or

$$n = A_1 e^1 + \ldots + A_n e^n$$

where $e^1, \ldots, e^n$ is the reciprocal basis of the given basis
$e_1, \ldots, e_n$. As for the running coordinates
$x^1, \ldots, x^n$ of an arbitrary point in the plane, they are, by
definition, contravariant, since they are the coordinates of the
radius vector of that point relative to the given basis:

$$x = x^1 e_1 + \ldots + x^n e_n$$

From the foregoing, it is clear that the left-hand member of the
equation of the hyperplane may be written in invariant form via the
scalar product

$$(n, x) + C = 0 \qquad (4)$$

2. If $(x_0^1, \ldots, x_0^n)$ is a fixed point of the hyperplane
and $x_0$ is its radius vector, then $C = -(n, x_0)$ and (4) becomes

$$(n, x - x_0) = 0 \qquad (5)$$

or, expanded,

$$A_1 (x^1 - x_0^1) + \ldots + A_n (x^n - x_0^n) = 0$$

From (5) it follows that the vector $n$ is orthogonal to any vector
$x - x_0$ lying in the hyperplane. Thus, the vector $n$ is a normal
to the hyperplane given by equation (1).

The contravariant components of the normal, that is, the components
of $n$ relative to the given basis $e_1, \ldots, e_n$, are given by

$$A^i = \sum g^{ik} A_k \qquad (6)$$

where $g^{ik}$ is the metric tensor.

3. Problem. Given in two-dimensional space, in some coordinate
system, the metric form

$$\|x\|^2 = 2(x^1)^2 + 2x^1 x^2 + (x^2)^2$$

the straight line (one-dimensional plane) $3x^1 + 4x^2 + 10 = 0$,
and the point $A(1, 1)$.

Find the foot of the perpendicular dropped on the given straight
line from point $A$ in the given metric.

Solution. We have the matrix of the metric form:
$G = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, whence
$G^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$, and so
$g^{11} = 1$, $g^{12} = g^{21} = -1$, $g^{22} = 2$. From the
equation of the straight line we find its normal $n = \{3, 4\}$
relative to the basis $e^1, e^2$. To obtain the components of the
normal relative to the given basis $e_1, e_2$, use formulas (6):

$$A^1 = g^{11} A_1 + g^{12} A_2 = -1, \qquad A^2 = g^{21} A_1 + g^{22} A_2 = 5$$

From this we obtain (in the given coordinate system) the equation of
the perpendicular to the given straight line passing through
$A(1, 1)$:

$$\frac{x^1 - 1}{-1} = \frac{x^2 - 1}{5}$$

Solving this equation simultaneously with the equation of the
straight line $MN$, we find point $B(2, -4)$, which is the desired
foot of the perpendicular.

If in the plane $(x^1, x^2)$ we construct an ellipse $\|x\| = 1$
(the unit circle in the given metric), then the directions of the
straight line $MN$ and the perpendicular $AB$ are conjugate relative
to this ellipse because of Subsections 9, 10, Section 7 (Fig. 56).

Fig. 56

Problem. Given in five-dimensional quasi-Euclidean space with
positive index $k = 3$ is a hyperplane
$x^1 + x^2 + x^3 + x^4 + x^5 = 1$ and a point $A(1, 1, 1, 1, 1)$.
Find the foot of a perpendicular dropped on this plane from $A$. It
is known that the coordinate system is defined by an orthonormal
basis whose first three vectors are unit vectors.

Solution. From the equation of the hyperplane we find its normal
$n = \{1, 1, 1, 1, 1\}$ relative to the basis $e^1, \ldots, e^5$,
whence $n = \{1, 1, 1, -1, -1\}$ relative to the basis
$e_1, \ldots, e_5$ (see Section 9, Subsection 18). We thus have the
equation of the perpendicular:

$$\frac{x^1 - 1}{1} = \frac{x^2 - 1}{1} = \frac{x^3 - 1}{1} = \frac{x^4 - 1}{-1} = \frac{x^5 - 1}{-1}$$

Solving them simultaneously with the equation of the given
hyperplane, we get the desired point: $x^1 = -3$, $x^2 = -3$,
$x^3 = -3$, $x^4 = 5$, $x^5 = 5$.
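Both problems follow the same recipe: raise the index of the normal
by (6), write the perpendicular through the given point in
parametric form, and substitute into the equation of the hyperplane.
The Python sketch below carries this out with exact fractions; the
function name and argument layout are hypothetical, and the sketch
assumes that the normal is not isotropic, so that
$\sum A_k A^k \neq 0$.

```python
from fractions import Fraction

def foot_of_perpendicular(G_inv, A_cov, C, point):
    """Foot of the perpendicular from `point` to sum A_k x^k + C = 0.

    G_inv holds the contravariant metric components g^{ik};
    A_cov holds the covariant components A_k of the normal.
    """
    n = len(point)
    # formula (6): A^i = sum over k of g^{ik} A_k
    A_con = [sum(Fraction(G_inv[i][k]) * A_cov[k] for k in range(n))
             for i in range(n)]
    # points of the perpendicular: x^i = point^i + t * A^i;
    # substituting into the hyperplane equation gives a linear equation for t
    numerator = sum(A_cov[k] * point[k] for k in range(n)) + C
    denominator = sum(A_cov[k] * A_con[k] for k in range(n))  # assumed nonzero
    t = -Fraction(numerator) / denominator
    return [point[i] + t * A_con[i] for i in range(n)]

# First problem: metric 2(x^1)^2 + 2 x^1 x^2 + (x^2)^2, line 3x^1 + 4x^2 + 10 = 0
foot_plane = foot_of_perpendicular([[1, -1], [-1, 2]], [3, 4], 10, [1, 1])

# Second problem: orthonormal basis, positive index 3, hyperplane sum of x^i = 1
signature = [1, 1, 1, -1, -1]
G_inv_5 = [[signature[i] if i == k else 0 for k in range(5)] for i in range(5)]
foot_5d = foot_of_perpendicular(G_inv_5, [1] * 5, -1, [1] * 5)
```

With the data of the two problems the sketch reproduces the points
$B(2, -4)$ and $(-3, -3, -3, 5, 5)$ found above.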

§ 11. Euclidean space. Orthogonal matrices. Orthogonal group 

1. Definition. Euclidean linear space is an $n$-dimensional linear
space with quadratic metric, provided that its metric quadratic form
$g(x, x)$ is positive definite.

The term Euclidean is also used for an $n$-dimensional affine space
if the corresponding linear space is Euclidean.

From now on we will be dealing precisely with such a space. That
will enable us to consider both vectors and points.

We will use the notation $E_n$ for an $n$-dimensional Euclidean
space. For the norm of a vector we will use the absolute-value sign:
$|x|$.

2. In Euclidean space the Cauchy-Bunyakovsky inequality holds:

$$(x, y)^2 \le (x, x) \cdot (y, y)$$

Therefore

$$\frac{|(x, y)|}{\sqrt{(x, x)} \, \sqrt{(y, y)}} \le 1$$

equality occurring if and only if the vectors $x, y$ are collinear
(linearly dependent).

Using this circumstance, we can introduce an angle between vectors.
Namely, if $x, y$ are nonzero vectors, then the angle between them
is a number $\varphi$ defined by

$$\cos\varphi = \frac{(x, y)}{|x| \cdot |y|}, \qquad 0 \le \varphi \le \pi$$

Note that $\varphi = 0$ if and only if the vectors are collinear and
in the same direction, whereas $\varphi = \pi$ signifies the vectors
have opposite directions.

Using the angle, we can write the scalar product as is done in
elementary vector algebra:

$$(x, y) = |x| \cdot |y| \cdot \cos\varphi$$
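The definition of the angle works in any basis, provided the scalar
product is computed with the metric tensor. A short Python sketch
under stated assumptions: the positive definite matrix `G` is an
illustrative choice, and the clamping step is added only to absorb
floating-point rounding.

```python
import math

def scalar_product(x, y, G):
    """(x, y) = sum of g_ik x^i y^k for a metric matrix G."""
    n = len(x)
    return sum(G[i][k] * x[i] * y[k] for i in range(n) for k in range(n))

def angle(x, y, G):
    """Angle phi in [0, pi] between nonzero vectors x and y."""
    c = scalar_product(x, y, G) / math.sqrt(
        scalar_product(x, x, G) * scalar_product(y, y, G))
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    return math.acos(c)

G = [[2, 1], [1, 1]]                 # sample positive definite metric
phi = angle([1, 0], [0, 1], G)       # angle between the basis vectors e_1, e_2
```

For this metric $(e_1, e_2) = 1$, $|e_1| = \sqrt{2}$, $|e_2| = 1$,
so the basis vectors form the angle $\pi/4$ rather than a right
angle.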


3. In the orthonormal basis $e_1, \ldots, e_n$ we have

$$|x|^2 = (x^1)^2 + (x^2)^2 + \ldots + (x^n)^2$$

$$(x, y) = x^1 y^1 + x^2 y^2 + \ldots + x^n y^n$$

Thus, the familiar formulas of analytic geometry for the length of a
vector and for the scalar product carry over directly to the
multidimensional case.

4. For an arbitrary unit vector $e$ we can introduce the angles
$\alpha_1, \ldots, \alpha_n$ which it forms with the vectors of the
orthonormal basis $e_1, \ldots, e_n$. The cosines of these angles,
$\cos\alpha_1, \ldots, \cos\alpha_n$, are called the direction
cosines of the vector $e$ (relative to the given basis). It is easy
to see that

$$e = e_1 \cos\alpha_1 + e_2 \cos\alpha_2 + \ldots + e_n \cos\alpha_n$$

and that

$$\cos^2\alpha_1 + \cos^2\alpha_2 + \ldots + \cos^2\alpha_n = 1$$

in complete analogy with the familiar relations of elementary
analytic geometry.

5. By definition, an $n$-dimensional Euclidean space is a
quadratic-metric space with positive index $k = n$. In $E_n$ space,
every orthonormal basis consists solely of unit vectors (there are
no imaginary-unit vectors). If $e_1, \ldots, e_n$ is an arbitrary
orthonormal basis in $E_n$, then a new basis

$$e_{k'} = \sum P^k_{k'} e_k \qquad (1)$$

will also be orthonormal if and only if the matrix
$P = \|P^k_{k'}\|$ satisfies the condition of $k$-orthogonality for
$k = n$ (see Section 6, equation (2)). But when $k = n$ the matrix
denoted in Section 1 by the symbol $E_k$ becomes the unit matrix
$E$. From this we conclude that in Euclidean space the
transformation (1) of an orthonormal basis $e_1, \ldots, e_n$ again
yields an orthonormal basis $e_{1'}, \ldots, e_{n'}$ if and only if

$$PP^* = E \qquad (2)$$

6. Definition. Every $n \times n$ matrix that satisfies (2) is said
to be orthogonal.

7. By Section 6, orthogonal $n \times n$ matrices constitute a
subgroup of the group of all nonsingular $n \times n$ matrices.

It is called the orthogonal group of $n \times n$ matrices and will
henceforth be denoted by $O$.

8. The set of all orthonormal bases in a given Euclidean space $E_n$
is nothing but a class of bases relative to the orthogonal group,
which class is generated by some one orthonormal basis.

If a linear space $L_n$ is given without a metric, then all the
bases in $L_n$ split into classes relative to the group $O$. Each of
these classes may be regarded as consisting of orthonormal bases if
we introduce in $L_n$ a certain definite Euclidean metric
corresponding to precisely that class. The Euclidean spaces into
which $L_n$ is converted by specification in it of such metrics are
distinct but metrically isomorphic. Their geometries are
algebraically identical in the sense that the invariants of one and
the same group $O$ constitute the subject of investigation in each
instance.

In Section 7 we discussed similar things in detail for the
two-dimensional case.


9. Because of (2), $(\det P)^2 = 1$, whence for every orthogonal
matrix

$$\det P = \pm 1$$

Thus, the orthogonal group may be regarded as a subgroup of the
group of matrices with unit modulus of the determinant (like all
$k$-orthogonal groups, as has already been pointed out).

The matrices $P \in O$ for which $\det P = +1$ constitute the
subgroup $O^+$ of the group $O$.

To matrices in $O^+$ there corresponds a transformation of
orthonormal bases with orientation preserved, which, to some extent,
is analogous to a rotation of a (two-dimensional) basis in the
Euclidean plane (see Section 7). Such transformations of bases in
spaces of any dimension are usually termed rotations (about a fixed
origin), and so the group $O^+$ is often called the rotation group
(also see Sections 7, 8, Chapter IX).

10. To a transformation of an orthonormal basis in $E_n$ via
formulas (1), Subsection 5, corresponds a transformation of
coordinates

$$x^{i'} = \sum Q^{i'}_i x^i \qquad (3)$$

with matrix $Q = \|Q^{i'}_i\| = (P^*)^{-1}$. Because of (2) we have

$$Q = P$$

Thus, in Euclidean space the transition from one orthonormal basis
to another via (1) with orthogonal matrix $P$ is associated with a
transformation of coordinates with the same orthogonal matrix:
$Q = P$.

11. A transformation of type (3) of the variables
$x^1, \ldots, x^n$ into the variables $x^{1'}, \ldots, x^{n'}$ is
called orthogonal if the matrix $Q$ is orthogonal. Orthogonal
transformations of variables may be described in purely algebraic
terms without resorting to orthonormal bases in $E_n$. These are the
transformations of type (3), and only such transformations, for
which the identity

$$(x^{1'})^2 + \ldots + (x^{n'})^2 = (x^1)^2 + \ldots + (x^n)^2$$

holds true.
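Condition (2), the value of the determinant and the invariance of
the sum of squares are easy to test on concrete matrices. The
following Python sketch is an illustration only; the rotation angle,
the sample matrices and the helper names are assumptions made for
the example.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][s] * b[s][j] for s in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_orthogonal(p, tol=1e-12):
    """Check condition (2): P P* = E, where P* is the transposed matrix."""
    n = len(p)
    prod = matmul(p, transpose(p))
    return all(abs(prod[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transform(q, x):
    """Coordinate transformation of type (3): x' = Q x."""
    return [sum(q[i][k] * x[k] for k in range(len(x))) for i in range(len(q))]

theta = 0.7  # arbitrary sample angle
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]    # determinant +1
reflection = [[1.0, 0.0], [0.0, -1.0]]             # determinant -1
x = [3.0, -4.0]
x_new = transform(rotation, x)
```

The rotation belongs to $O^+$, the reflection is orthogonal with
determinant $-1$, and the sum of squares of the coordinates of `x`
is unchanged by the rotation.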

12. The metric form in $E_n$ space, relative to the orthonormal
basis $e_1, \ldots, e_n$, is

$$|x|^2 = (x^1)^2 + \ldots + (x^n)^2$$

whence, if $G$ is the matrix of the form and $E$ is the unit matrix,
then

$$G^{-1} = G = E$$

Thus, $g_{ii} = 1$, $g_{ik} = 0$ ($i \neq k$) and in exactly the
same way $g^{ii} = 1$, $g^{ik} = 0$ ($i \neq k$). For this reason
and on the basis of formulas (9) of Section 9

$$e^i = e_i$$

which means that if the given basis is orthonormal in $E_n$, then
the reciprocal basis coincides with it. At the same time, the
covariant, contravariant and mixed components of any tensor having
identical indices coincide. For instance,

$$a^{ik} = a^{i\cdot}_{\cdot k} = a_{ik}$$

In particular, for any vector

$$x^k = x_k$$

(also see Subsection 18 of Section 9). For this reason, if the
nature of a problem referring to Euclidean space is such that it is
possible to confine oneself to orthonormal bases, then the orders of
the tensors are not distinguished and all indices on vectors and
tensors are written as subscripts (as, for example, in the theory of
elasticity).

13. We now indicate a special notation for any orthogonal matrix.
Imagine that we have a table of the angles that the vectors of a new
orthonormal basis form with the vectors of the old orthonormal
basis:

$$\begin{array}{c|cccc}
 & e_1 & e_2 & \ldots & e_n \\ \hline
e_{1'} & \alpha_1 & \beta_1 & \ldots & \gamma_1 \\
e_{2'} & \alpha_2 & \beta_2 & \ldots & \gamma_2 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
e_{n'} & \alpha_n & \beta_n & \ldots & \gamma_n
\end{array}$$

Using the conventional alphabet, $\alpha, \beta, \ldots, \gamma$, we
have, by Subsection 4, the relations

$$\left. \begin{aligned}
e_{1'} &= e_1 \cos\alpha_1 + e_2 \cos\beta_1 + \ldots + e_n \cos\gamma_1, \\
&\ldots \\
e_{n'} &= e_1 \cos\alpha_n + e_2 \cos\beta_n + \ldots + e_n \cos\gamma_n
\end{aligned} \right\}$$

This is an expanded version of the formula (1) via the direction
cosines of the new basis vectors. Matrix $P$ has also been written
down. Thus, any orthogonal matrix can be written via direction
cosines as

$$P = \begin{pmatrix}
\cos\alpha_1 & \cos\beta_1 & \ldots & \cos\gamma_1 \\
\cos\alpha_2 & \cos\beta_2 & \ldots & \cos\gamma_2 \\
\ldots & \ldots & \ldots & \ldots \\
\cos\alpha_n & \cos\beta_n & \ldots & \cos\gamma_n
\end{pmatrix}$$

With respect to this notation, observe the following characteristic
property peculiar solely to orthogonal matrices, namely, in the case
of an orthogonal matrix, the sum of the squares of the elements of a
column or a row is equal to unity (this is due to normalization of
bases); the sum of the products of corresponding elements in two
columns or two rows is equal to zero (this is because each of the
bases is orthogonal).

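14. Since $P^{-1} = P^*$, it follows that

$$P^{-1} = \begin{pmatrix}
\cos\alpha_1 & \cos\alpha_2 & \ldots & \cos\alpha_n \\
\cos\beta_1 & \cos\beta_2 & \ldots & \cos\beta_n \\
\ldots & \ldots & \ldots & \ldots \\
\cos\gamma_1 & \cos\gamma_2 & \ldots & \cos\gamma_n
\end{pmatrix}$$

15. From the preceding two subsections and equation $Q = P$ we have
formulas for the transformation of coordinates:

$$\left. \begin{aligned}
x_{1'} &= x_1 \cos\alpha_1 + x_2 \cos\beta_1 + \ldots + x_n \cos\gamma_1, \\
&\ldots \\
x_{n'} &= x_1 \cos\alpha_n + x_2 \cos\beta_n + \ldots + x_n \cos\gamma_n
\end{aligned} \right\}$$

The inverse formulas

$$\left. \begin{aligned}
x_1 &= x_{1'} \cos\alpha_1 + x_{2'} \cos\alpha_2 + \ldots + x_{n'} \cos\alpha_n, \\
&\ldots \\
x_n &= x_{1'} \cos\gamma_1 + x_{2'} \cos\gamma_2 + \ldots + x_{n'} \cos\gamma_n
\end{aligned} \right\}$$

are obtained from the condition $Q^{-1} = Q^*$ (or
$Q^{-1} = P^{-1}$).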

§ 12. The normal equation of a hyperplane in Euclidean space 

1. Given in $E_n$ a coordinate system with an arbitrary basis
$e_1, \ldots, e_n$. Specified in this coordinate system is the
equation of the hyperplane

$$A_1 x^1 + \ldots + A_n x^n + C = 0 \qquad (1)$$

where $n = A_1 e^1 + \ldots + A_n e^n$ is the normal to the
hyperplane, and $e^1, \ldots, e^n$ is the reciprocal basis (see
Section 10).

If for $n$ we take the unit normal $n_0$ and the constant term is
negative (or zero), then under these conditions (1) is called the
normal equation. Putting the constant term $C = -p$ ($p \ge 0$) in
this case, we write the normal equation as

$$(n_0, x) - p = 0 \qquad (2)$$

where $x = x^1 e_1 + \ldots + x^n e_n$ is the radius vector of a
running point of the hyperplane.

In order to reduce the general equation (1) to normal form, multiply
it by the normalizing factor

$$m = \pm \frac{1}{|n|}$$

choosing the sign with the condition that $mC < 0$; then
$p = -mC > 0$ (if $C = 0$ we can agree to take $m$ with the plus
sign). Obviously, $n_0 = mn$ is a unit vector.

Denote by $\varphi$ the angle between $n_0$ and $x$; from (2) we
have

$$p = (n_0, x) = |x| \cdot \cos\varphi \qquad (3)$$

As in elementary analytic geometry, in the theory of
multidimensional Euclidean spaces this quantity is called the
projection of the vector $x$ on the normal with positive direction
along the vector $n_0$. At the same time it is readily seen that $p$
is the distance from the origin to the hyperplane. Indeed, from (3)

$$p \le |x|$$

and $p = |x|$ if $\cos\varphi = 1$ ($\varphi = 0$). Thus, $p$ is the
length of the shortest of the radius vectors having termini on the
given hyperplane. The particular case of a two-dimensional plane in
three-dimensional space is shown in Fig. 57.

If $x^*$ is the radius vector of some point $M^*$ not lying on the
hyperplane, then the number

$$\delta = (n_0, x^*) - p \qquad (4)$$

is the distance from $M^*$ to the given hyperplane, the sign being
minus if $M^*$ and the origin $O$ lie to one side of the hyperplane
and plus if $M^*$ and $O$ are on different sides.

To see that this is so, consider a running point $M$ of the
hyperplane; if $x$ is its radius vector, then from (3) and (4) we
have

$$\delta = (n_0, x^*) - (n_0, x) = -(n_0, x - x^*) = -(n_0, \overrightarrow{M^*M})$$

and so $|\delta| \le |\overrightarrow{M^*M}|$; therefore $|\delta|$
is the length of the shortest of the vectors $\overrightarrow{M^*M}$
(see Fig. 58 where the space is three-dimensional).

2. In the particular case where the basis $e_1, \ldots, e_n$ is
orthonormal, the reciprocal basis coincides with it, and all the
relations we have considered are completely analogous to the
familiar facts of elementary analytic geometry. In this case

$$m = \frac{\pm 1}{\sqrt{A_1^2 + \ldots + A_n^2}}$$

and the normal equation may be written as

$$x_1 \cos\alpha + x_2 \cos\beta + \ldots + x_n \cos\gamma - p = 0 \qquad (5)$$

where $\cos\alpha, \cos\beta, \ldots, \cos\gamma$ are the direction
cosines of the vector $n_0$.

In an arbitrary skew basis, the notation (5) of a normal equation is
meaningless. But the fundamental aspect of the problem of reducing a
general equation to normal form and of the problem of the distance
of a point to the hyperplane does not become more complicated. The
only thing to bear in mind is that the normalizing factor is to be
computed from the general formula

$$m = \frac{\pm 1}{\sqrt{\sum g^{ik} A_i A_k}} \qquad (6)$$

(see Section 9, Subsection 9).

3. Problem. Given on a Euclidean plane a straight line
$6x^1 + 8x^2 - 5 = 0$ and a point $M^*(2, 1)$. Find the distance
from the point to the line. The metric tensor is known:
$g_{11} = 2$, $g_{12} = g_{21} = 1$, $g_{22} = 1$ (in the given
coordinate system).

Solution. First of all let us verify that the indicated metric
tensor defines the Euclidean metric. We have

$$|x|^2 = 2(x^1)^2 + 2x^1 x^2 + (x^2)^2 = (x^1)^2 + (x^1 + x^2)^2 \ge 0$$

equality being attained only for $x^1 = x^2 = 0$. This means the
metric tensor is indeed Euclidean.

Inverting the matrix $G$, we get $g^{11} = 1$,
$g^{12} = g^{21} = -1$, $g^{22} = 2$. Furthermore, $A_1 = 6$,
$A_2 = 8$, whence, by (6),

$$\sum g^{ik} A_i A_k = 68, \qquad m = \frac{1}{2\sqrt{17}}$$

and so

$$\delta = \frac{15}{2\sqrt{17}}$$
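The computation can be repeated mechanically for any hyperplane and
any Euclidean metric given by its contravariant components. A
minimal Python sketch; the function name and its argument layout are
hypothetical and serve only to illustrate the steps of the solution.

```python
import math

def signed_distance(G_inv, A, C, point):
    """Deviation delta of `point` from the hyperplane sum A_k x^k + C = 0.

    G_inv holds the contravariant metric components g^{ik}.
    """
    n = len(A)
    # |n|^2 = sum of g^{ik} A_i A_k
    norm_sq = sum(G_inv[i][k] * A[i] * A[k] for i in range(n) for k in range(n))
    m = 1.0 / math.sqrt(norm_sq)
    if C > 0:
        m = -m  # sign of the normalizing factor chosen so that m * C < 0
    # delta = (n_0, x*) - p = m * (sum A_k x*^k + C)
    return m * (sum(A[k] * point[k] for k in range(n)) + C)

G_inv = [[1, -1], [-1, 2]]           # inverse of the metric g_11=2, g_12=1, g_22=1
delta = signed_distance(G_inv, [6, 8], -5, [2, 1])
delta_origin = signed_distance(G_inv, [6, 8], -5, [0, 0])
```

For the data of the problem `delta` equals $15 / (2\sqrt{17})$. It
is positive, while the value for the origin is negative, so $M^*$
and the origin lie on different sides of the line.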

§ 13. The volume of a parallelepiped in Euclidean space. The discriminant tensor. Vector product

1. Let the metric of the space $E_n$ be defined by specification,
relative to a basis $e_1, \ldots, e_n$, of the metric form

$$|x|^2 = \sum g_{ik} x^i x^k$$

Set $g = \det \|g_{ik}\|$. Since the metric form is positive
definite in $E_n$, it follows that $g > 0$.

Consider in $E_n$ an arbitrary oriented parallelepiped $P$
constructed on the vectors $x_1, \ldots, x_n$. By Section 5,
Chapter VI, we can determine the oriented volume $V$ of $P$ by
setting

$$V = c \sqrt{g} \, D(x_1, \ldots, x_n) \qquad (1)$$

where $c$ is a constant common to all parallelepipeds. Choosing $c$
is tantamount to choosing a unit of volume. The fact that in (1) we
have used the discriminant of the metric form will help us to
connect the unit of volume with the unit of length. For the unit we
take the volume of an $n$-dimensional cube with unit side, that is
to say, the volume of a parallelepiped constructed on the vectors of
an orthonormal basis. Let $e^0_1, \ldots, e^0_n$ be an orthonormal
basis, $g_0$ the determinant of the metric form relative to the
basis $e^0_1, \ldots, e^0_n$. It is clear that $g_0 = 1$.

On the other hand, the matrix made up of the components of the basis
vectors relative to this basis is a unit matrix; with respect to the
basis $e^0_1, \ldots, e^0_n$ we have
$D(e^0_1, \ldots, e^0_n) = 1$. Finally, by assumption the volume of
a parallelepiped constructed on the vectors $e^0_1, \ldots, e^0_n$
is equal to unity. From (1), by virtue of the foregoing, we find
$c = 1$. Thus, with our choice of the unit of volume,

$$V = \sqrt{g} \, D(x_1, \ldots, x_n) \qquad (2)$$

From this it follows that the volume of a parallelepiped constructed
on the basis vectors $e_1, \ldots, e_n$ is given by the formula

$$V = \sqrt{g} \qquad (3)$$

2. Since $g = 1$ in an orthonormal basis, formula (2) for
orthonormal bases takes on the simpler aspect

$$V = D(x_1, \ldots, x_n)$$

3. By (2) and also by Section 5, Chapter VI, we have the
discriminant tensor of a Euclidean space relative to any basis

$$\varepsilon_{i_1 \ldots i_n} = \sqrt{g} \, \Delta_{i_1 \ldots i_n} \qquad (4)$$

Since the discriminant tensor is skew-symmetric with respect to all
indices, the system of equations (4) is equivalent to the single
equation $\varepsilon_{12 \ldots n} = \sqrt{g}$ (since
$\Delta_{12 \ldots n} = 1$). For $n = 3$ see Fig. 59. In orthonormal
bases

$$\varepsilon_{12 \ldots n} = 1$$

4. Every linear subspace $L_k$ of dimension $k$ lying in $E_n$ is
itself a $k$-dimensional Euclidean space. Indeed, the scalar product
of every pair of vectors is defined for $L_k$ since it is defined
throughout the space $E_n$. The metric form is positive definite in
$L_k$ since $|x|^2 > 0$ for every $x \in E_n$, $x \neq 0$.

5. By the foregoing, the volume of any parallelepiped
($k$-dimensional volume) is defined in $L_k$. If an orientation is
given in $L_k$ (by specification of a basis $a_1, \ldots, a_k$),
then also defined in $L_k$ is the oriented volume of oriented
parallelepipeds.

6. It is easy to obtain a formula expressing the $k$-dimensional
volume of a parallelepiped constructed on an arbitrary set of
independent vectors $a_1, \ldots, a_k$ in $E_n$. Merely take
$a_1, \ldots, a_k$ for a basis in the linear hull $L_k$ of these
vectors. The metric tensor of subspace $L_k$ relative to the basis
$a_1, \ldots, a_k$ has the components $\gamma_{ij} = (a_i, a_j)$;
from this and (3) we have

$$V^2 = \det \|\gamma_{ij}\| = \begin{vmatrix}
(a_1, a_1) & (a_1, a_2) & \ldots & (a_1, a_k) \\
(a_2, a_1) & (a_2, a_2) & \ldots & (a_2, a_k) \\
\ldots & \ldots & \ldots & \ldots \\
(a_k, a_1) & (a_k, a_2) & \ldots & (a_k, a_k)
\end{vmatrix}$$

Thus the square of the desired $k$-dimensional volume is given by
the Gram determinant of the vectors $a_1, \ldots, a_k$.
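The Gram determinant is straightforward to evaluate. The Python
sketch below assumes that the vectors are given by their coordinates
in an orthonormal basis of $E_n$, so that the scalar product is the
ordinary sum of products; the elimination routine and all names are
illustrative, not part of the text.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product in an orthonormal basis."""
    return sum(a * b for a, b in zip(u, v))

def det(m):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)       # a zero column: the determinant vanishes
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result         # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def gram_volume_squared(vectors):
    """Square of the k-dimensional volume: det of the matrix (a_i, a_j)."""
    gram = [[dot(u, v) for v in vectors] for u in vectors]
    return det(gram)

area_sq = gram_volume_squared([[1, 0, 0], [1, 1, 0]])  # parallelogram in E_3
```

For the sample parallelogram with base 1 and height 1 the squared
area is 1, and for linearly dependent vectors the Gram determinant
vanishes.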

7. We conclude this section with an application of the
discriminant tensor. In three-dimensional Euclidean space consider the
vector product $z = [x \times y]$ of the vector $x = x^1 e_1 + x^2 e_2 + x^3 e_3$ by
the vector $y = y^1 e_1 + y^2 e_2 + y^3 e_3$. It turns out that the covariant
components of the vector product $z = z_1 e^1 + z_2 e^2 + z_3 e^3$ are given
by the following simple formula:

$$z_i = \sum_{\alpha, \beta} \varepsilon_{i\alpha\beta}\, x^\alpha y^\beta \tag{5}$$

From (5) we immediately get a formula that yields the contravariant
components of the vector product, that is to say, its components
relative to the given basis $e_1, e_2, e_3$:

$$z^i = \sum_{k, \alpha, \beta} g^{ik} \varepsilon_{k\alpha\beta}\, x^\alpha y^\beta \tag{6}$$

It is to be stressed that (5) and (6) permit computing a vector
product relative to any (generally, nonorthonormal) basis.
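
Formula (5) can be checked on a concrete skew basis. The sketch below is an illustration only and rests on assumptions not made in the text: the basis vectors are specified by Cartesian coordinates, the basis is right-handed (so that $\sqrt{g}$ equals the determinant of the basis vectors), and all function names are hypothetical. It compares the covariant components given by (5) with the scalar products $(z, e_i)$ computed from the ordinary Cartesian vector product.

```python
def cross(u, v):
    """Ordinary vector product in Cartesian coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def alternating(i, j, k):
    """Alternating symbol for indices 0, 1, 2: +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) // 2

# A skew, right-handed basis e_1, e_2, e_3 (assumed example data).
basis = [(1, 0, 0), (1, 1, 0), (1, 1, 2)]
sqrt_g = dot(basis[0], cross(basis[1], basis[2]))  # volume of (e_1, e_2, e_3)

def covariant_cross(x, y):
    """Formula (5): z_i = sum over a, b of eps_{iab} x^a y^b."""
    return tuple(
        sum(sqrt_g * alternating(i, a, b) * x[a] * y[b]
            for a in range(3) for b in range(3))
        for i in range(3)
    )

def to_cartesian(components):
    """Cartesian coordinates of the vector with the given contravariant components."""
    return tuple(sum(c * e[m] for c, e in zip(components, basis)) for m in range(3))

x, y = (1, 2, 0), (0, 1, 3)          # contravariant components x^i, y^i
z_cov = covariant_cross(x, y)
z_cart = cross(to_cartesian(x), to_cartesian(y))
expected = tuple(dot(z_cart, e) for e in basis)   # z_i = (z, e_i)
```

Both computations give the same triple of covariant components, as formula (5) asserts.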

To prove (5), note first of all that both members are first-order
covariant axial tensors (the left member is an axial tensor by the
definition of a vector product, since it changes sign when the
orientation of the basis is changed; the right member is an axial tensor
because it involves the discriminant tensor, which is itself
an axial tensor). Equalities of tensor quantities are invariant and
therefore it suffices to verify (5) relative to some specially chosen
basis. If $x$ and $y$ are dependent, formula (5) is clearly true since
the left and right members are then zero. Let $x$ and $y$ be
independent. We take a basis with the first two vectors $e_1 = x$, $e_2 = y$.
For $e_3$ take the unit vector orthogonal to $e_1$, $e_2$ (Fig. 60). Then the
vector product $z = S e_3$, where $S$ is the area of the parallelogram
$(e_1, e_2)$ (Fig. 61). From the definition of reciprocal bases it follows
that here $e^3 = e_3$. Therefore $z = S e^3$ and so on the left-hand side
of (5) we have

$$z_1 = 0, \quad z_2 = 0, \quad z_3 = S$$

Since $x = e_1 = \{1, 0, 0\}$, $y = e_2 = \{0, 1, 0\}$ it follows that

$$\sum_{\alpha, \beta} \varepsilon_{i\alpha\beta}\, x^\alpha y^\beta = \varepsilon_{i12}$$

and so on the right of (5) we have, for $i = 1, 2, 3$, the numbers

$$\varepsilon_{112} = 0, \quad \varepsilon_{212} = 0, \quad \varepsilon_{312} = \varepsilon_{123} = \sqrt{g}$$

But $\sqrt{g}$ is the volume of the parallelepiped $(e_1, e_2, e_3)$. And since
$e_3$ is a unit vector and is orthogonal to $e_1$ and $e_2$, this volume is
equal to the area $S$. Thus $\sqrt{g} = S$ and the proof of formula (5)
is complete.

LINEAR TRANSFORMATIONS OF
EUCLIDEAN SPACE


§ 1. Adjoint of a transformation 

1. In $n$-dimensional (real) Euclidean space consider a linear
transformation $y = Ax$.

Definition. A linear transformation $y = \overset{*}{A}x$ is said to be the
adjoint of the given transformation $A$ if for any $x$ and $z$ in $E_n$ we
have the following equation of scalar products:

$$(Ax, z) = (x, \overset{*}{A}z) \tag{1}$$

2. Theorem. The adjoint $\overset{*}{A}$ of a given transformation $A$ is always
uniquely defined.

Proof. For a given vector $z$ we seek a vector $f$ such that the
equation

$$(Ax, z) = (x, f) \tag{2}$$

holds true for any $x \in E_n$. Having found such a vector $f$, set
$\overset{*}{A}z = f$. It is required to prove that $f$ exists, is uniquely defined,
and depends linearly on $z$. To do this, introduce the basis $e_1, \ldots, e_n$
and the reciprocal basis $e^1, \ldots, e^n$. Put $x = e^i$; now if the desired
vector $f$ exists, then

$$(Ae^i, z) = (e^i, f) \tag{3}$$

The scalar product $(e^i, f)$ is equal to the component $f^i$ of $f$
relative to the basis $e_1, \ldots, e_n$, and so from (3) we get $f^i = (Ae^i, z)$.
Hence, only the vector

$$f = \sum_i (Ae^i, z)\, e_i \tag{4}$$

is the desired vector $f$.

We now show that if $f = \overset{*}{A}z$ is given by (4), then the
condition (2) holds for any $x \in E_n$. Expand $x$ in terms of the reciprocal
basis: $x = x_1 e^1 + \ldots + x_n e^n$; putting this expansion into the left
member of (2), we get

$$(Ax, z) = \sum_i x_i (Ae^i, z) = \sum_i x_i f^i = (x, f)$$

Here, use is made of the expression for the scalar product of two
vectors, one of which is expanded in terms of the given basis, the
other in terms of the reciprocal basis.

These calculations yield (4) for the vector $\overset{*}{A}z$ and show that no
other value is possible for $\overset{*}{A}z$, which demonstrates the existence
and uniqueness of $\overset{*}{A}z$. The linearity of the transformation $\overset{*}{A}z$
follows from (4) and the linearity of the scalar product in the
argument $z$. The proof of the theorem is complete.

3. In Chapter VII proof was given that a linear transformation $A$
is associated with a mixed tensor $A^i_k$. In a space without a metric,
the upper and lower indices of the tensor are not related in any
way. We are now considering a space with a quadratic metric (an
inner-product space) and we can raise or lower the indices of any tensor
according to Section 9 of Chapter VIII. The operations of raising and
lowering indices will be used frequently in what follows. This
makes it necessary to agree on which of the indices of the tensor
$A^i_k$ is to be regarded as the first and which as the second.

We make the convention that the upper index of the tensor of
a linear transformation is the first, and we will write $A^{i}_{\cdot k} = A^i_k$.

Lowering the upper index, we obtain the covariant components
of the tensor of the transformation $A$:

$$A_{ik} = \sum_\alpha g_{i\alpha} A^{\alpha}_{\cdot k}$$

Raising the lower index, we get the contravariant components of
the tensor of the transformation $A$:

$$A^{ik} = \sum_\alpha g^{k\alpha} A^{i}_{\cdot \alpha}$$

4. Suppose we have a matrix $A = \|A^i_k\| = \|A^{i}_{\cdot k}\|$ of the
transformation $y = Ax$ relative to an arbitrary basis $e_1, \ldots, e_n$. Relative
to the same basis, let us find the matrix of the adjoint $\overset{*}{A}$, which we
denote by $\|\overset{*}{A}{}^i_k\| = \|\overset{*}{A}{}^{i}_{\cdot k}\|$. Consider the scalar product

$$(Ax, z) = \sum_{\alpha, i, k} g_{\alpha k} A^{\alpha}_{\cdot i}\, x^i z^k$$

Also consider the scalar product

$$(x, \overset{*}{A}z) = \sum_{\alpha, i, k} g_{\alpha i}\, x^i\, \overset{*}{A}{}^{\alpha}_{\cdot k} z^k$$

We have obtained two bilinear forms that must be identically
equal. This is only possible if all their coefficients coincide:

$$\sum_\alpha g_{\alpha i} \overset{*}{A}{}^{\alpha}_{\cdot k} = \sum_\alpha g_{\alpha k} A^{\alpha}_{\cdot i} \tag{5}$$

Contracting both members of (5) with the contravariant metric
tensor $g^{ji}$ and using (3), Section 9, Chapter VIII, we get the
desired expressions of the elements of the matrix of the
transformation $\overset{*}{A}$:

$$\overset{*}{A}{}^{j}_{\cdot k} = \sum_{\alpha, i} g^{ji} g_{\alpha k} A^{\alpha}_{\cdot i} \tag{6}$$

At the same time this is an expression of the tensor of the adjoint
transformation in terms of the tensor of the given transformation.

Formula (6) may be written more briefly as $\overset{*}{A}{}^{j}_{\cdot k} = A_{k}^{\cdot j}$.

5. The simplest aspect of the relation between a given
transformation and the adjoint is obtained in covariant components. From
(5) we have

$$\overset{*}{A}_{ik} = A_{ki}$$

Thus, the covariant components of the tensor of the adjoint are
equal to the covariant components of the tensor of the given
transformation with indices interchanged.

The adjoint of a transformation is a very important concept and 
is frequently encountered in various divisions of mathematics and 
its applications. For this reason the notion of an adjoint transfor¬ 
mation forms the basis of the classification given below. 
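
The relations of Subsections 4 and 5 can be illustrated numerically. The sketch below is not taken from the text: it assumes a two-dimensional space, stores a mixed tensor as a matrix whose row number is the upper index, and uses an arbitrarily chosen metric `G` and transformation `A`. Under these conventions formula (6) reads, in matrix form, $\overset{*}{A} = G^{-1} A^{T} G$; the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse_2x2(m):
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def scalar(G, x, z):
    """(x, z) = sum of g_ik x^i z^k."""
    return sum(G[i][k] * x[i] * z[k] for i in range(2) for k in range(2))

def apply(A, x):
    """Components of A x for a matrix of mixed components A^i_k."""
    return [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]

# Assumed example data: a positive definite metric and an arbitrary transformation.
G = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(4)]]
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]

# Formula (6) in matrix form: adjoint = G^{-1} A^T G.
A_adj = matmul(inverse_2x2(G), matmul(transpose(A), G))

# Covariant components: lowering the upper index is multiplication by G on the left.
A_cov = matmul(G, A)
A_adj_cov = matmul(G, A_adj)
```

With these data `A_adj_cov` is the transpose of `A_cov`, which is the statement of Subsection 5, and the defining equation (1) can be verified on the basis vectors.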

6. Let us indicate the transformations that are most simply
related to their adjoints:

(1) $\overset{*}{A} = A$, self-adjoint transformations;

(2) $\overset{*}{A} = -A$, skew-adjoint (or skew) transformations;

(3) $\overset{*}{A} = A^{-1}$; as will be shown later on, these transformations
coincide with isometric transformations.

It will be proved later on that any linear transformation in 
Euclidean space reduces to a product of self-adjoint and isometric 
transformations. In this connection, a particularly detailed study 
will be made of self-adjoint and isometric transformations. 

Skew transformations play an important part in mechanics. We 
will look into the geometric meaning of a skew transformation in 
the three-dimensional case (Section 6). 

§ 2. Lemma on the characteristic roots of a symmetric matrix 

1. Let $A = \|A_{ik}\|$ be a real matrix and $p(\lambda) = \det(A - \lambda E)$ its
characteristic polynomial. The following lemma is valid.

Lemma. If a real matrix A is symmetric, then all the roots of 
its characteristic polynomial are real. 





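2. Proof. Let $\lambda$ be an arbitrary root of the polynomial $p(\lambda)$.
Then, firstly, the system

$$\sum_k (A_{ik} - \lambda \delta_{ik})\, x_k = 0 \qquad (i = 1, \ldots, n)$$

has a nonzero solution $(x_1, \ldots, x_n)$; secondly, the number $\bar{\lambda}$ is also
a root of $p(\lambda)$ since the coefficients of the polynomial are real
(here $\bar{\lambda}$ is the conjugate of the complex number $\lambda$; the bar above
the letter will have the same meaning in the sequel). We now prove
that the numbers $\bar{x}_1, \ldots, \bar{x}_n$ form a solution (which is obviously
nontrivial) of the system

$$\sum_k (A_{ik} - \bar{\lambda} \delta_{ik})\, \bar{x}_k = 0 \qquad (i = 1, \ldots, n)$$

By the rules for operating with complex numbers we have

$$\sum_k (A_{ik} - \bar{\lambda} \delta_{ik})\, \bar{x}_k = \overline{\sum_k (A_{ik} - \lambda \delta_{ik})\, x_k} = 0$$

Thus

$$\sum_k A_{ik} x_k = \lambda x_i, \qquad \sum_k A_{ik} \bar{x}_k = \bar{\lambda} \bar{x}_i$$

Multiply the first equation by $\bar{x}_i$ and the second by $x_i$ and then
sum over $i$:

$$\sum_{i, k} A_{ik} \bar{x}_i x_k = \lambda \sum_i x_i \bar{x}_i = \lambda \sum_i |x_i|^2,$$

$$\sum_{i, k} A_{ik} \bar{x}_k x_i = \bar{\lambda} \sum_i \bar{x}_i x_i = \bar{\lambda} \sum_i |x_i|^2$$

The matrix $A_{ik}$ is symmetric, and so

$$\sum_{i, k} A_{ik} \bar{x}_i x_k = \sum_{i, k} A_{ki} \bar{x}_i x_k = \sum_{i, k} A_{ik} \bar{x}_k x_i$$

hence

$$\lambda \sum_i |x_i|^2 = \bar{\lambda} \sum_i |x_i|^2$$

But the solution $(x_1, \ldots, x_n)$ is not zero, i.e., $\sum_i |x_i|^2 \neq 0$. Hence
$\lambda = \bar{\lambda}$. Thus $\lambda$ is a real number and the lemma is proved.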


§ 3. Self-adjoint transformations

1. A self-adjoint transformation $A$ is characterized by the
condition $\overset{*}{A} = A$.

By (5), Section 1, the matrix of a self-adjoint transformation,
which matrix is given relative to an arbitrary basis, is
characterized by the relation

$$\sum_\alpha g_{\alpha i} A^{\alpha}_{\cdot k} = \sum_\alpha g_{\alpha k} A^{\alpha}_{\cdot i} \tag{1}$$

whence

$$A_{ik} = A_{ki} \tag{2}$$

Thus, a distinguishing feature of a self-adjoint transformation is
the symmetry of the matrix of the covariant components of its
tensor.

2. Due to (1) the matrix of a self-adjoint transformation, and
only of such a transformation, is symmetric relative to an
orthonormal basis: $A^i_k = A^k_i$. In other words, relative to an orthonormal
basis, the condition of self-adjointness of a transformation is
expressed by the matrix equation $A^* = A$, where $A$ is the
transformation matrix and the asterisk stands for the transpose.

3. Lemma 1 . All roots of the characteristic polynomial of a self- 
adjoint transformation are real. 

Proof. We know that the characteristic roots of a transformation 
are invariant under a change of basis. Let us pass to an ortho¬ 
normal basis. The transformation matrix then becomes symmetric 
and the assertion of Lemma 1 will follow from the results of the 
preceding section. 

Lemma 2. Let $e$ be an eigenvector of the self-adjoint
transformation $A$, and let the subspace $L_e$ be the orthogonal complement of
the linear hull of the vector $e$. Then $L_e$ is an invariant subspace for $A$.

Proof. Let $x \in L_e$. This means that $(x, e) = 0$. Because of
self-adjointness, $(Ax, e) = (x, Ae)$. Taking advantage of the fact that $e$
is an eigenvector, we have

$$(Ax, e) = (x, Ae) = (x, \lambda e) = \lambda (x, e) = 0$$

In other words, $Ax \in L_e$ and the proof of Lemma 2 is complete.

Theorem. For every self-adjoint transformation there is at least
one orthonormal basis consisting of eigenvectors.

Proof. We carry out the proof by induction. In the
one-dimensional case every nonzero vector is an eigenvector and therefore
when $n = 1$ the theorem holds. Let $n > 1$ be any natural number.
Suppose the theorem is valid for any self-adjoint transformation
in $E_{n-1}$.

Let $A$ be a self-adjoint transformation in $E_n$. Since all roots of
$p(\lambda)$ are real, there will be at least one eigenvector $e$. We
construct the orthogonal complement $L_e$ of the linear hull of $e$. The
subspace $L_e$ is of dimension $n - 1$ and is an invariant subspace
under the transformation $A$. It is thus possible to regard $A$ not on
the entire space $E_n$ but only on $L_e$, where $A$ is clearly also
self-adjoint. By the induction hypothesis, there is an orthonormal basis
$e_2, \ldots, e_n$ in $L_e$ consisting of eigenvectors. Adjoin the unit
vector $e_1$, which is collinear with the eigenvector $e$. The vector $e_1$ is
orthogonal to all the eigenvectors $e_2, \ldots, e_n$, and so we obtain the
desired basis $e_1, e_2, \ldots, e_n$, which completes the proof of the
theorem.

Corollary. Every self-adjoint transformation relative to an
orthonormal basis can be reduced to diagonal form.

As we know, the matrix $A$ relative to this basis is written thus:

$$A = \begin{Vmatrix}
\lambda_1 & & 0 \\
 & \ddots & \\
0 & & \lambda_n
\end{Vmatrix} \tag{3}$$

where $\lambda_1, \ldots, \lambda_n$ are the collection of all roots of the characteristic
polynomial

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \ldots (\lambda - \lambda_n) \tag{4}$$

Here, the eigenvalue $\lambda_k$ corresponds to the eigenvector $e_k$ (to the
basis vector with the same number label $k$).

4. Lemma 3. The eigenvectors corresponding to numerically
distinct characteristic roots are orthogonal to each other.

Proof. Let

$$Ax = \lambda_1 x, \qquad Ay = \lambda_2 y$$

where $\lambda_1 \neq \lambda_2$. Form the scalar product of the first equation by $y$,
of the second by $x$, and subtract:

$$(Ax, y) - (Ay, x) = \lambda_1 (x, y) - \lambda_2 (y, x) = (\lambda_1 - \lambda_2)(x, y)$$

Because of self-adjointness,

$$(Ax, y) - (Ay, x) = (Ax, y) - (y, Ax) = 0$$

whence $(x, y) = 0$. Lemma 3 is proved.

Lemma 4. If $\lambda$ is a root of multiplicity $m$ of the characteristic
polynomial of a self-adjoint transformation $A$, then
$\operatorname{rank}(A - \lambda E) = n - m$, so that to the root $\lambda$ there correspond $m$
linearly independent eigenvectors.

Proof. According to Subsection 3, there is a basis in which the
transformation matrix $A$ is of diagonal form (3). Relative to this
basis, the characteristic matrix $A - \lambda E$ has the form

$$A - \lambda E = \begin{Vmatrix}
\lambda_1 - \lambda & & 0 \\
 & \ddots & \\
0 & & \lambda_n - \lambda
\end{Vmatrix} \tag{5}$$

For instance, let $\lambda_1$ be a root of multiplicity $m$ of the
characteristic polynomial (4), that is, $\lambda_1 = \lambda_2 = \ldots = \lambda_m$, $\lambda_{m+1} \neq \lambda_1$,
..., $\lambda_n \neq \lambda_1$. Then for $\lambda = \lambda_1$ the first $m$ diagonal elements in matrix
(5) vanish and the remaining diagonal elements are not equal
to zero, so that

$$\operatorname{rank}(A - \lambda_1 E) = n - m \tag{6}$$

By Section 2, Chapter VII, equation (6) is valid in any basis.
By Section 6, Chapter VII, the root $\lambda_1$ is associated with $m$
independent eigenvectors. The proof of Lemma 4 is complete.

5. To construct an orthonormal basis consisting of eigenvectors
of a self-adjoint transformation, it is first of all necessary to find
the roots of the characteristic polynomial: $\lambda_1, \lambda_2, \ldots, \lambda_n$. They are
all real but may be multiple. We will take note of which roots have
the same values. Say, let $\lambda_1$ be of multiplicity $m$:

$$\lambda_1 = \lambda_2 = \ldots = \lambda_m = \lambda'$$

$\lambda_{m+1}$ has multiplicity $k$:

$$\lambda_{m+1} = \ldots = \lambda_{m+k} = \lambda''$$

and so on. To the root $\lambda'$ correspond $m$ linearly independent
eigenvectors $e_1, \ldots, e_m$ whose components are obtained from a
homogeneous linear system of equations with matrix $A - \lambda' E$. Denote by
$L'_m$ the linear hull of the vectors $e_1, \ldots, e_m$. The subspace $L'_m$ is
invariant and the transformation in it acts like a similarity
transformation with coefficient $\lambda'$. Therefore, every vector in $L'_m$ is an
eigenvector. Choose in $L'_m$ an arbitrary orthonormal basis
$\tilde{e}_1, \ldots, \tilde{e}_m$. It may be obtained by orthogonalizing the system of
vectors $e_1, \ldots, e_m$ or via other considerations.

We now find the eigenvectors $e_{m+1}, \ldots, e_{m+k}$ that correspond to
the root $\lambda''$; denote their linear hull by $L''_k$. Like $L'_m$, the subspace
$L''_k$ is invariant. In $L''_k$ choose an arbitrary orthonormal basis
$\tilde{e}_{m+1}, \ldots, \tilde{e}_{m+k}$. The vectors $\tilde{e}_{m+1}, \ldots, \tilde{e}_{m+k}$ are eigenvectors of the
transformation that correspond to the root $\lambda''$. By Lemma 3, each
of these eigenvectors is orthogonal to any vector in $L'_m$ (in other
words, $L'_m \perp L''_k$).

Consequently, the vectors $\tilde{e}_1, \ldots, \tilde{e}_m, \tilde{e}_{m+1}, \ldots, \tilde{e}_{m+k}$ taken
together form an orthonormal system.

We then pass to the next root and construct an invariant
subspace of appropriate dimension and in it an orthonormal basis.
Continuing this process, we obtain the desired basis after a finite
number of operations.

Example 1. Given in three-dimensional space an orthonormal
basis $e_1, e_2, e_3$ and the transformation

$$y_1 = x_1 + 2x_2 - 4x_3,$$
$$y_2 = 2x_1 - 2x_2 - 2x_3,$$
$$y_3 = -4x_1 - 2x_2 + x_3$$

It is immaterial whether the indices on the components are
subscripts or superscripts since the basis is orthonormal (see
Chapter VIII, Section 11, Subsection 12).

The matrix of this transformation is symmetric:

$$A = \begin{Vmatrix}
1 & 2 & -4 \\
2 & -2 & -2 \\
-4 & -2 & 1
\end{Vmatrix}$$

The symmetrical nature of a matrix in an orthonormal basis
indicates that the transformation is self-adjoint.

Let us construct an orthonormal basis out of eigenvectors. We
write the system

$$\sum_j (A_{ij} - \lambda \delta_{ij})\, x_j = 0$$

In the case at hand it is of the form

$$\left.\begin{aligned}
(1 - \lambda) x_1 + 2x_2 - 4x_3 &= 0, \\
2x_1 + (-2 - \lambda) x_2 - 2x_3 &= 0, \\
-4x_1 - 2x_2 + (1 - \lambda) x_3 &= 0
\end{aligned}\right\} \tag{7}$$

Form the characteristic polynomial:

$$p(\lambda) = \begin{vmatrix}
1 - \lambda & 2 & -4 \\
2 & -2 - \lambda & -2 \\
-4 & -2 & 1 - \lambda
\end{vmatrix} = -(\lambda^3 - 27\lambda - 54)$$

Its roots are $\lambda_1 = 6$, $\lambda_2 = \lambda_3 = -3$.

Putting $\lambda_1 = 6$ into (7) we get a solution (vector):

$$e_1 = \{2, 1, -2\}$$

Next, substitute $\lambda_2 = \lambda_3 = -3$ to obtain a system of rank one
with two independent solutions. These solutions are readily chosen
so that they yield orthogonal vectors:

$$e_2 = \{1, 2, 2\}, \qquad e_3 = \{2, -2, 1\}$$

Normalizing the eigenvectors thus found, we get the desired
basis:

$$\tilde{e}_1 = \{2/3,\ 1/3,\ -2/3\},$$
$$\tilde{e}_2 = \{1/3,\ 2/3,\ 2/3\},$$
$$\tilde{e}_3 = \{2/3,\ -2/3,\ 1/3\}$$

The matrix $\tilde{A}$ of the transformation $A$ in the new basis can be written
out at once without any calculations: it is of diagonal form. On
the diagonal we have the eigenvalues in the same order in which the
corresponding eigenvectors lie in the basis:

$$\tilde{A} = \begin{Vmatrix}
6 & 0 & 0 \\
0 & -3 & 0 \\
0 & 0 & -3
\end{Vmatrix}$$
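
The results of Example 1 can be checked directly. The following sketch is an illustration added here, with hypothetical helper names; it uses exact rational arithmetic to confirm that the three vectors are eigenvectors, that they are mutually orthogonal, and that the matrix of the transformation in the normalized basis is diagonal.

```python
from fractions import Fraction

A = [[1, 2, -4], [2, -2, -2], [-4, -2, 1]]
# (eigenvalue, eigenvector) pairs as found in Example 1.
eigenpairs = [(6, (2, 1, -2)), (-3, (1, 2, 2)), (-3, (2, -2, 1))]

def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# A v = lambda v for every pair.
is_eigen = all(apply(A, v) == tuple(lam * c for c in v) for lam, v in eigenpairs)

# Pairwise orthogonality of the three eigenvectors.
is_orthogonal = all(dot(eigenpairs[i][1], eigenpairs[j][1]) == 0
                    for i in range(3) for j in range(i))

# Each eigenvector has length 3, so dividing by 3 normalizes it.
new_basis = [[Fraction(c, 3) for c in v] for _, v in eigenpairs]

# Matrix of the transformation in the new orthonormal basis:
# entry (i, j) is the scalar product of A e_j with e_i.
D = [[dot(apply(A, new_basis[j]), new_basis[i]) for j in range(3)]
     for i in range(3)]
```

The matrix `D` comes out diagonal with the entries 6, -3, -3, in the order of the eigenvectors in the basis.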

Example 2. The dimension is $n = 2$ and the basis $e_1, e_2$ is
arbitrary. In this case, the metric characteristic, that is, the
components of the metric tensor, must be given. Let

$$g_{11} = (e_1, e_1) = 1, \qquad g_{12} = (e_1, e_2) = 1,$$
$$g_{21} = (e_2, e_1) = 1, \qquad g_{22} = (e_2, e_2) = 4$$

The basis is skew and so the positions of the indices of a tensor
(upper or lower) are essential and it is necessary that the indices be
properly set. Suppose we have a self-adjoint linear transformation $y = Ax$:

$$\left.\begin{aligned}
y^1 &= x^1 + 4x^2, \\
y^2 &= x^1 + x^2
\end{aligned}\right\}$$

It is required to reduce it to an orthonormal basis composed of
eigenvectors.

First of all we have to verify that the condition of
self-adjointness is indeed observed, which means we have to be convinced that
the matrix

$$\begin{Vmatrix} A^1_1 & A^1_2 \\ A^2_1 & A^2_2 \end{Vmatrix} =
\begin{Vmatrix} 1 & 4 \\ 1 & 1 \end{Vmatrix}$$

will become symmetric (see (2)) after the indices are lowered by
means of the metric tensor. In other words, the tensor

$$A_{ik} = \sum_{\alpha = 1}^{2} g_{i\alpha} A^{\alpha}_{k}$$

must be symmetric.

Actually, it is enough to compare two components of this tensor:
$A_{12}$ and $A_{21}$. Their calculation yields:

$$A_{12} = \sum_{\alpha = 1}^{2} g_{1\alpha} A^{\alpha}_{2} = g_{11} A^1_2 + g_{12} A^2_2 = 5,$$

$$A_{21} = \sum_{\alpha = 1}^{2} g_{2\alpha} A^{\alpha}_{1} = g_{21} A^1_1 + g_{22} A^2_1 = 5$$

Thus, $A_{12} = A_{21}$. The transformation is self-adjoint and we can
apply the general theory.

The system of equations $\sum_j (A^i_j - \lambda \delta^i_j)\, x^j = 0$ is written as
follows:

$$\left.\begin{aligned}
(1 - \lambda) x^1 + 4x^2 &= 0, \\
x^1 + (1 - \lambda) x^2 &= 0
\end{aligned}\right\}$$

The characteristic polynomial is

$$\begin{vmatrix} 1 - \lambda & 4 \\ 1 & 1 - \lambda \end{vmatrix} = \lambda^2 - 2\lambda - 3$$

with roots $\lambda_1 = -1$, $\lambda_2 = 3$.

When $\lambda = \lambda_1$ we get $x^1 = 2$, $x^2 = -1$ and thus the eigenvector
$l_1 = \{2, -1\}$. For $\lambda = \lambda_2$ we have the eigenvector $l_2 = \{2, 1\}$. By
Lemma 3, these eigenvectors are orthogonal in the given metric.
Let us calculate their norms:

$$\|l_1\|^2 = \sum_{\alpha, \beta} g_{\alpha\beta}\, l_1^\alpha l_1^\beta = 4, \qquad \|l_1\| = 2;$$

$$\|l_2\|^2 = \sum_{\alpha, \beta} g_{\alpha\beta}\, l_2^\alpha l_2^\beta = 12, \qquad \|l_2\| = 2\sqrt{3}$$

Dividing $l_1$ and $l_2$ by their norms, we get the desired basis

$$\tilde{e}_1 = \{1,\ -1/2\}, \qquad \tilde{e}_2 = \{1/\sqrt{3},\ 1/(2\sqrt{3})\}$$

Relative to this basis, the transformation at hand has the matrix

$$\begin{Vmatrix} -1 & 0 \\ 0 & 3 \end{Vmatrix}$$

An analogous problem in the multidimensional case requires
considerably more involved computations.
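
The computations of Example 2 can be repeated mechanically. The sketch below is an added illustration with hypothetical function names; matrices of mixed components are stored with the upper index as the row number, which is an assumption of this sketch rather than a convention stated in the example.

```python
G = [[1, 1], [1, 4]]   # metric tensor g_ik of the skew basis
A = [[1, 4], [1, 1]]   # mixed components A^i_k (row = upper index)

def lower_index(G, A):
    """Covariant components A_ik = sum over a of g_ia A^a_k."""
    return [[sum(G[i][a] * A[a][k] for a in range(2)) for k in range(2)]
            for i in range(2)]

def scalar(u, v):
    """Scalar product in the given metric: sum of g_ik u^i v^k."""
    return sum(G[i][k] * u[i] * v[k] for i in range(2) for k in range(2))

def apply(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))

A_cov = lower_index(G, A)          # symmetric, so A is self-adjoint
l1, l2 = (2, -1), (2, 1)           # eigenvectors found in the example

eigen_ok = apply(A, l1) == (-2, 1) and apply(A, l2) == (6, 3)
orthogonal_in_g = scalar(l1, l2) == 0
norms_squared = (scalar(l1, l1), scalar(l2, l2))
```

Note that the ordinary sum of products of components of `l1` and `l2` equals 3, not 0: the orthogonality holds only with respect to the metric $g_{ik}$.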

§ 4. Reducing a quadratic form to canonical form in an
orthonormal basis

1. We know that every quadratic form may be reduced to ca¬ 
nonical form in some basis. 

The problem now involves a supplementary restriction: to carry 
through the solution within the class of orthonormal bases. 




Let a quadratic form

$$f(x, x) = \sum_{i, j} a_{ij} x_i x_j$$

be given in the orthonormal basis $e_1, \ldots, e_n$.

We introduce an auxiliary linear transformation $y = Ax$, which
has the same matrix as the quadratic form relative to the original
basis $e_1, \ldots, e_n$:

$$A^i_j = a_{ij}$$

Using the fact that the basis is orthonormal, we have $A^i_j = A_{ij}$
(see Section 9, Chapter VIII). Besides, $a_{ij} = a_{ji}$. Hence the
transformation $y = Ax$ is self-adjoint and so we have to find the basis
$\tilde{e}_1, \ldots, \tilde{e}_n$ consisting of the eigenvectors of this transformation.

When passing to the basis $\tilde{e}_1, \ldots, \tilde{e}_n$, the matrix of the
transformation $y = Ax$ takes the form

$$\tilde{A} = Q A Q^{-1}$$

and the matrix of the quadratic form is

$$A' = P A P^*$$

But the old basis and new basis are orthonormal and so matrix $P$
is orthogonal; hence,

$$P = Q, \qquad P^* = P^{-1} = Q^{-1}$$

From this it follows that in the new basis the matrices of the
quadratic form and the auxiliary linear transformation coincide:
$A' = \tilde{A}$.

Relative to the basis $\tilde{e}_1, \ldots, \tilde{e}_n$ the transformation matrix $\tilde{A}$ has
the diagonal form

$$\tilde{A} = \begin{Vmatrix}
\lambda_1 & & 0 \\
 & \ddots & \\
0 & & \lambda_n
\end{Vmatrix}$$

and so the quadratic form can be written in canonical form as

$$f(x, x) = \lambda_1 \tilde{x}_1^2 + \ldots + \lambda_n \tilde{x}_n^2$$

Here, $\lambda_1, \ldots, \lambda_n$ are the characteristic roots of the transformation $A$
which correspond to its eigenvectors $\tilde{e}_1, \ldots, \tilde{e}_n$.

Conclusion. Naturally associated with every quadratic form
specified relative to an orthonormal basis is a self-adjoint
transformation. Reducing this transformation to an orthonormal basis of its
eigenvectors automatically brings the quadratic form to canonical form.

2. Example. Let the following quadratic form be given in an
orthonormal basis of three-dimensional Euclidean space:

$$f(x, x) = x_1^2 + 4x_1 x_2 - 8x_1 x_3 - 2x_2^2 - 4x_2 x_3 + x_3^2$$

It is required to reduce it to canonical form also in an orthonormal
basis.

Solution. The matrix of the quadratic form is

$$A = \begin{Vmatrix}
1 & 2 & -4 \\
2 & -2 & -2 \\
-4 & -2 & 1
\end{Vmatrix}$$

In the preceding section we considered a self-adjoint
transformation with precisely that matrix, and so we take advantage of the
result and write down the answer:

$$f(x, x) = 6\tilde{x}_1^2 - 3\tilde{x}_2^2 - 3\tilde{x}_3^2$$

in the orthonormal basis

$$\tilde{e}_1 = \{2/3,\ 1/3,\ -2/3\},$$
$$\tilde{e}_2 = \{1/3,\ 2/3,\ 2/3\},$$
$$\tilde{e}_3 = \{2/3,\ -2/3,\ 1/3\}$$
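
The answer can be confirmed by substituting the new coordinates into the original form. The sketch below is an added illustration (the function names are hypothetical): it builds the vector with coordinates $\tilde{x}_1, \tilde{x}_2, \tilde{x}_3$ in the new basis, evaluates the original form on it, and compares the value with the canonical expression.

```python
from fractions import Fraction

def f(x):
    """The quadratic form of the example in the original orthonormal basis."""
    x1, x2, x3 = x
    return (x1 * x1 + 4 * x1 * x2 - 8 * x1 * x3
            - 2 * x2 * x2 - 4 * x2 * x3 + x3 * x3)

new_basis = [
    (Fraction(2, 3), Fraction(1, 3), Fraction(-2, 3)),
    (Fraction(1, 3), Fraction(2, 3), Fraction(2, 3)),
    (Fraction(2, 3), Fraction(-2, 3), Fraction(1, 3)),
]

def from_new_coordinates(t):
    """Old coordinates of the vector t_1 e~_1 + t_2 e~_2 + t_3 e~_3."""
    return tuple(sum(t[i] * new_basis[i][m] for i in range(3)) for m in range(3))

def canonical(t):
    """Canonical form in the new coordinates."""
    return 6 * t[0] ** 2 - 3 * t[1] ** 2 - 3 * t[2] ** 2

samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (2, -1, 3)]
agree = all(f(from_new_coordinates(t)) == canonical(t) for t in samples)
```

Agreement on a handful of sample points is of course only a spot check, not a proof; the proof is the argument of Subsection 1.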

§ 5. The joint reduction to canonical form of two quadratic forms 

1. Given in a linear space (without a metric) a fixed arbitrary
basis $e_1, \ldots, e_n$ and two quadratic forms:

$$f(x, x) = \sum_{i, k} a_{ik} x^i x^k, \qquad g(x, x) = \sum_{i, k} g_{ik} x^i x^k$$

Now, can a basis be found in which both quadratic forms will 
take on canonical form? 

Theorem. If at least one of two quadratic forms is positive defi¬ 
nite, there will be a basis in which both forms assume canonical 
form. 

Proof. Let the form g(x,x) be positive definite. Introduce a 
Euclidean metric in the linear space, taking g{x, x) for the metric 
form. If in the resulting Euclidean space we take an arbitrary 
orthonormal basis, the metric form g{x, x) assumes normal form. 
Then we pass to another orthonormal basis so that f{x, x) reduces 
to canonical form. In the process, the normal form of the metric 
form will clearly remain intact. The theorem is proved. 

2. In the practical case of a joint reduction of two quadratic
forms to canonical form it is not necessary to break the search for
the desired basis into two stages, as is done in the proof.

By Section 4, a quadratic form may be reduced to canonical form
without going outside the class of orthonormal bases. Such a
reduction is done via a self-adjoint transformation. The fact that the
transformation is self-adjoint does not depend on the choice of
basis. We therefore do as follows.

Take $g(x, x)$ for the metric form and, using the coefficients of
$f(x, x)$, seek in the given basis an auxiliary self-adjoint
transformation $y = Ax$. Its covariant tensor $A_{ij}$ must be symmetric and
must be determined by the tensor equation $A_{ij} = a_{ij}$, which holds
true in any basis. Now, to find the transformation matrix $A$ all we
need to do is raise an index of the tensor $A_{ij}$ via the metric tensor:

$$A^{k}_{\cdot j} = \sum_\alpha A_{\alpha j}\, g^{k\alpha} = \sum_\alpha g^{k\alpha} a_{\alpha j} \tag{1}$$

All the quantities in the right-hand member of this equation are
given, and so the matter reduces to finding the eigenvalues
$\lambda_1, \ldots, \lambda_n$ and the orthonormal basis of the corresponding
eigenvectors $\tilde{e}_1, \ldots, \tilde{e}_n$ of the self-adjoint transformation with the known
matrix $\|A^k_j\|$. See Subsection 5 of Section 3.

3. Further simplifications are possible. It turns out that there is
no need to compute $A^k_j = A^{k}_{\cdot j}$ in order to find the eigenvectors
$\tilde{e}_1, \ldots, \tilde{e}_n$ and the eigenvalues $\lambda_1, \ldots, \lambda_n$.

Indeed, taking into account (1), we have

$$a_{kj} - \lambda g_{kj} = \sum_\alpha g_{k\alpha} (A^\alpha_j - \lambda \delta^\alpha_j) \tag{2}$$

and so the system of equations

$$\sum_j (A^k_j - \lambda \delta^k_j)\, x^j = 0 \tag{3}$$

is equivalent, for any $\lambda$, to the system

$$\sum_j (a_{kj} - \lambda g_{kj})\, x^j = 0 \tag{4}$$

Namely, due to (1) and (2) we have the equations

$$\sum_j (a_{kj} - \lambda g_{kj})\, x^j = \sum_\alpha g_{k\alpha} \sum_j (A^\alpha_j - \lambda \delta^\alpha_j)\, x^j,$$

$$\sum_j (A^k_j - \lambda \delta^k_j)\, x^j = \sum_\alpha g^{k\alpha} \sum_j (a_{\alpha j} - \lambda g_{\alpha j})\, x^j$$

from which clearly follows the equivalence of the systems (3)
and (4).

Thus, everything reduces to solving system (4), and in place of
the characteristic polynomial $p(\lambda)$ we have to consider the
polynomial

$$q(\lambda) = \det(a_{kj} - \lambda g_{kj})$$

But it has exactly the same roots (counted according to
multiplicity) as the characteristic polynomial

$$p(\lambda) = \det(A^k_j - \lambda \delta^k_j)$$

For, due to (2),

$$q(\lambda) = g \cdot p(\lambda)$$

for any $\lambda$, where $g = \det G \neq 0$. Thus $q(\lambda)$ differs from $p(\lambda)$ only
in a factor that does not include $\lambda$. Observe that $q(\lambda)$ and system
(4) can be written down immediately from the given quadratic
forms.

4. Summary. Given in an arbitrary basis $e_1, \ldots, e_n$ two
quadratic forms:

$$f(x, x) = \sum_{i, j} a_{ij} x^i x^j, \qquad g(x, x) = \sum_{i, j} g_{ij} x^i x^j$$

where $g(x, x)$ is positive definite. To reduce them jointly to
canonical form we solve the characteristic equation

$$q(\lambda) = \det(a_{kj} - \lambda g_{kj}) = 0$$

Let $\lambda_1, \ldots, \lambda_n$ be the roots (all real).

Substitute these roots in succession in place of $\lambda$ into the system
(4) and for each root find solutions $\{x^j\}$ of this system; if the
multiplicity of the root $\lambda$ is equal to $m$, then there are $m$ linearly
independent solutions (see Lemma 4, Section 3). These solutions must be
chosen so that they form an orthonormal system in the metric
$g(x, x)$.

Then all the solutions thus found will yield the matrix of the
components of the vectors of the desired basis $\tilde{e}_1, \ldots, \tilde{e}_n$. If the
numbering is correct, that is, vector $\tilde{e}_k$ corresponds to root $\lambda_k$, then
relative to the basis $\tilde{e}_1, \ldots, \tilde{e}_n$ the given quadratic forms will
assume the form

$$f = \sum_i \lambda_i (\tilde{x}^i)^2, \qquad g = \sum_i (\tilde{x}^i)^2$$

5. Example. $n = 2$. Given the forms

$$f(x, x) = 2(x^1)^2 + 10x^1 x^2 + 8(x^2)^2,$$
$$g(x, x) = (x^1)^2 + 2x^1 x^2 + 4(x^2)^2$$

Reduce them jointly to canonical form.

It is easy to verify that $g(x, x)$ is positive definite. This is
evident, for instance, from the equation $g(x, x) = (x^1 + x^2)^2 + 3(x^2)^2$.
A joint reduction is possible and it can be carried out
by the foregoing procedure.

Write out the matrices of the quadratic forms:

$$F = \begin{Vmatrix} 2 & 5 \\ 5 & 8 \end{Vmatrix}, \qquad
G = \begin{Vmatrix} 1 & 1 \\ 1 & 4 \end{Vmatrix}$$

and form the characteristic polynomial:

$$q(\lambda) = \det(F - \lambda G) =
\begin{vmatrix} 2 - \lambda & 5 - \lambda \\ 5 - \lambda & 8 - 4\lambda \end{vmatrix}
= 3(\lambda^2 - 2\lambda - 3)$$

The roots are $\lambda_1 = -1$, $\lambda_2 = 3$. System (4) becomes

$$\left.\begin{aligned}
(2 - \lambda) x^1 + (5 - \lambda) x^2 &= 0, \\
(5 - \lambda) x^1 + (8 - 4\lambda) x^2 &= 0
\end{aligned}\right\}$$

For $\lambda = \lambda_1 = -1$ we find the solution $x^1 = 2$, $x^2 = -1$ and
write it as a vector: $l_1 = \{2, -1\}$; for $\lambda = \lambda_2 = 3$ we find the
solution $l_2 = \{2, 1\}$. The vectors $l_1$ and $l_2$ are orthogonal in the
metric $g$ because they are the eigenvectors of the auxiliary operator
that is self-adjoint in the metric $g$ and correspond to different
eigenvalues. The basis in which the forms are given is not
orthogonal in the metric $g$. For this reason, the orthogonality of $l_1$ and
$l_2$ is not immediately apparent. But in checking the computations
the reader will see that $(l_1, l_2) = g(l_1, l_2) = 0$.

Let us now compute the norms of $l_1$ and $l_2$ in the metric $g$:

$$\|l_1\| = \sqrt{g(l_1, l_1)} = 2, \qquad \|l_2\| = \sqrt{g(l_2, l_2)} = \sqrt{12}$$

Normalize the basis:

$$\tilde{e}_1 = \{1,\ -1/2\}, \qquad \tilde{e}_2 = \{2/\sqrt{12},\ 1/\sqrt{12}\}$$

In the basis $\tilde{e}_1, \tilde{e}_2$ the quadratic forms are

$$f = -(\tilde{x}^1)^2 + 3(\tilde{x}^2)^2, \qquad g = (\tilde{x}^1)^2 + (\tilde{x}^2)^2$$

Remark. The eigenvectors $l_1$ and $l_2$ also form a basis in which
both forms have canonical form, but with different coefficients of
the squares. The normalization of $l_1$ and $l_2$ in the metric $g$ is
needed so as to be able to write out the reduced quadratic forms
without any supplementary computation of their coefficients and
make use of the earlier found roots of the characteristic equation.
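
The joint reduction of this example can be verified numerically. The following sketch is an added illustration using only the data of the example; the function names are hypothetical, and floating-point arithmetic is used because the normalization of $l_2$ involves $\sqrt{12}$.

```python
import math

F = [[2, 5], [5, 8]]   # matrix of the form f
G = [[1, 1], [1, 4]]   # matrix of the positive definite form g

def bilinear(M, u, v):
    """Value of the symmetric bilinear form with matrix M on the pair u, v."""
    return sum(M[i][k] * u[i] * v[k] for i in range(2) for k in range(2))

def q(lam):
    """q(lambda) = det(F - lambda G)."""
    return ((F[0][0] - lam * G[0][0]) * (F[1][1] - lam * G[1][1])
            - (F[0][1] - lam * G[0][1]) * (F[1][0] - lam * G[1][0]))

l1, l2 = (2, -1), (2, 1)   # solutions of system (4) for the roots -1 and 3

# Normalize in the metric g.
e1 = tuple(c / math.sqrt(bilinear(G, l1, l1)) for c in l1)
e2 = tuple(c / math.sqrt(bilinear(G, l2, l2)) for c in l2)

# Matrices of both forms relative to the new basis.
f_new = [[bilinear(F, u, v) for v in (e1, e2)] for u in (e1, e2)]
g_new = [[bilinear(G, u, v) for v in (e1, e2)] for u in (e1, e2)]
```

Up to rounding, `f_new` is the diagonal matrix with entries -1 and 3, and `g_new` is the unit matrix. For the unnormalized vectors the matrices are diagonal as well, with entries -4, 36 and 4, 12 respectively, which illustrates the Remark.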


§ 6. Skew-adjoint transformations 
1. Recall that a linear transformation $z = Ay$ in Euclidean space
is said to be skew-adjoint (or skew) if $\overset{*}{A} = -A$.

Exhibiting this relation relative to some basis, we get

$$\overset{*}{A}{}^{i}_{\cdot k} = -A^{i}_{\cdot k}$$

Therefore, the condition that $A$ is skew-adjoint in the components
(relative to an arbitrary basis) is written thus:

$$\sum_\alpha g_{\alpha i} A^{\alpha}_{\cdot k} = -\sum_\alpha g_{\alpha k} A^{\alpha}_{\cdot i}$$

or

$$A_{ik} = -A_{ki}$$

In an orthonormal basis, upper and lower indices of tensors are of
equal status, and the matrix of a skew-adjoint transformation will
also be skew-symmetric.

Note that a linear transformation can always be written so that
it is defined by covariant components. To do this, it is necessary to
represent the argument $y$ in the given basis:

$$y = y^1 e_1 + y^2 e_2 + \ldots + y^n e_n$$

and expand the function $z$ in terms of the reciprocal basis:

$$z = z_1 e^1 + z_2 e^2 + \ldots + z_n e^n$$

Then the transformation $z = Ay$ can be written as

$$z_i = \sum_k A_{ik} y^k$$

where $A_{ik}$ is the covariant tensor of the transformation $A$.

2. We now prove that in the three-dimensional case the skew
transformation $z = Ay$ can be represented as a vector
multiplication of a fixed vector $a$ by the vector $y$.

Theorem. If $z = Ay$ is a skew-adjoint transformation in
three-dimensional Euclidean space, then there exists a unique vector $a$
such that

$$z = [a \times y]$$

Remark. Since the transformation $Ay$ is invariant under a
change of basis, the vector $a$ in the equation $Ay = [a \times y]$ is not
invariant but axial (an axial tensor of order one).

Proof of the theorem. We know that

$$z_i = \sum_k A_{ik} y^k$$

On the other hand, if $z = [a \times y]$, then

$$z_i = \sum_{\alpha, k} \varepsilon_{i\alpha k}\, a^\alpha y^k$$

where $\varepsilon_{i\alpha k}$ is the discriminant tensor.

To prove the theorem, we have to ensure the following
equations:

$$\sum_\alpha \varepsilon_{i\alpha k}\, a^\alpha = A_{ik}$$

where $A_{ik}$ is the given tensor, $A_{ki} = -A_{ik}$ ($i, k = 1, 2, 3$), and
$a = \{a^1, a^2, a^3\}$ is the desired vector. We have a system of nine
linear equations for the three components of the vector $a$. We have
to prove its consistency by utilizing the skew symmetry of the
given tensor $A_{ik}$ and the special form of the left-hand member. The
system of equations is written as a tensor equation and so we can
verify the equation relative to any one basis, and thus
simplify the problem via a special choice of basis. We
reason as follows.

The determinant of a skew-symmetric matrix of odd order is
always zero. (This property is obvious for the three-dimensional
case at hand.) For this reason, the bilinear form

$$f(u, v) = \sum_{i, k} A_{ik} u^i v^k$$

which corresponds to the tensor $A_{ik}$ is singular and its zero
subspace has dimension not less than unity.

Choose an orthonormal basis $e_1, e_2, e_3$ so that the vector $e_3$ is in
the zero subspace of the bilinear form (it is immaterial whether it
is in the right or left subspace).

Then the matrix $A_{ik}$ is simplified in the following manner:

$$\|A_{ik}\| = \begin{Vmatrix}
0 & A_{12} & 0 \\
-A_{12} & 0 & 0 \\
0 & 0 & 0
\end{Vmatrix}$$

The principal component of the discriminant tensor $\varepsilon_{123} = 1$
because the basis is orthonormal, whence, as is readily computed,

$$\Big\| \sum_\alpha \varepsilon_{i\alpha k}\, a^\alpha \Big\| = \begin{Vmatrix}
0 & -a^3 & a^2 \\
a^3 & 0 & -a^1 \\
-a^2 & a^1 & 0
\end{Vmatrix}$$

It is now clear that for the matrices $\|A_{ik}\|$ and $\|\sum_\alpha \varepsilon_{i\alpha k} a^\alpha\|$ to
coincide, it is necessary to put

$$a = \{a^1, a^2, a^3\} = \{0, 0, -A_{12}\}$$

No other vector is suitable. The proof of the theorem is complete.
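
In an orthonormal basis the vector $a$ can be read off the skew-symmetric matrix directly, by comparing it with the matrix $\|\sum_\alpha \varepsilon_{i\alpha k} a^\alpha\|$ written out in the proof. The sketch below is an added illustration under the assumption that the basis is orthonormal and right-handed ($\varepsilon_{123} = 1$); the function names are hypothetical.

```python
def axial_vector(A):
    """Vector a such that A y = [a x y], for a skew-symmetric 3x3 matrix A
    given in an orthonormal right-handed basis: a = (A_32, A_13, A_21)."""
    return (A[2][1], A[0][2], A[1][0])

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(A, y):
    return tuple(sum(A[i][k] * y[k] for k in range(3)) for i in range(3))

# A general skew-symmetric matrix (assumed example data).
A = [[0, -3, 2],
     [3, 0, -1],
     [-2, 1, 0]]
a = axial_vector(A)
y = (4, -5, 6)
same = apply(A, y) == cross(a, y)

# The special basis used in the proof, where only A_12 is nonzero (here A_12 = 5).
A_special = [[0, 5, 0], [-5, 0, 0], [0, 0, 0]]
a_special = axial_vector(A_special)
```

For the special matrix the function returns the vector with components 0, 0, -5, in agreement with $a = \{0, 0, -A_{12}\}$.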

3. Mechanical interpretation. In three-dimensional Euclidean
space, fix a point O and through it draw a straight line collinear
with a vector a. Take this line for the axis of rotation. Let us find
the distribution of linear velocities of points of a rigid body
rotating about this axis with constant angular velocity $\omega = |a|$. The
linear velocity v depends solely on the position that the moving
point occupies at a given time. This position is characterized by a
radius vector $\overrightarrow{OM} = y$ (the moving point of the body passes
through the geometric point M in space). The velocity v is
orthogonal to the plane in which the vectors a and y lie. The numerical
value of the linear velocity $|v|$ is equal to the product of the
angular velocity $|a|$ by the distance of the point from the axis of
rotation, and this product coincides with the area of the
parallelogram constructed on the vectors a and y (Fig. 62). So with a
proper choice of orientation of the basis, the velocity of any point
of the rotating body is expressed by the formula $v = [a \times y]$.

Thus, any skew transformation in $E_3$ may be interpreted as
a distribution of velocities of a uniformly rotating body: the point
with radius vector y has an instantaneous linear velocity $v = Ay$;
the vector a of angular velocity, and so also the axis of rotation,
are found via Subsection 2.
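
The correspondence between the skew matrix and the vector product can be checked on numbers. The following is only a minimal sketch in plain Python, assuming a right-handed orthonormal basis; the helper names `cross`, `skew_matrix` and `apply` and the sample vectors are hypothetical and chosen for illustration.

```python
def cross(a, y):
    """Vector product [a x y] in a right-handed orthonormal basis."""
    return [a[1] * y[2] - a[2] * y[1],
            a[2] * y[0] - a[0] * y[2],
            a[0] * y[1] - a[1] * y[0]]

def skew_matrix(a):
    """Matrix with entries A_ik = sum over alpha of eps_{i alpha k} a^alpha."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2],
            [a3, 0.0, -a1],
            [-a2, a1, 0.0]]

def apply(A, y):
    """Product of a 3x3 matrix and a column of coordinates."""
    return [sum(A[i][k] * y[k] for k in range(3)) for i in range(3)]

a = [0.0, 0.0, 2.0]   # angular velocity along e3, |a| = 2 (sample value)
y = [1.0, 0.0, 5.0]   # a point at distance 1 from the axis of rotation

v_matrix = apply(skew_matrix(a), y)   # v = Ay
v_cross = cross(a, y)                 # v = [a x y]
```

In this sample both computations give a velocity of magnitude 2 directed along $e_2$, that is, the angular velocity multiplied by the distance of the point from the axis.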

§ 7. Isometric transformations 

1. Definition. A linear transformation I is said to be isometric
if it preserves the norm of every vector:

$$\|Ix\| = \|x\| \tag{1}$$

From now on we will be dealing solely with linear transforma¬ 
tions in Euclidean spaces. 

2. From the definition it follows that an isometric transformation
is nonsingular, since a singular transformation
would carry a nonzero vector into the zero vector. Therefore the
isometric transformation $z = Iy$ has an inverse, $y = I^{-1}z$, which is
also isometric.

Remark. For the invertibility of an isometric transformation it
is essential to assume that the space is finite-dimensional.

3. Theorem. If I is an isometric transformation, then
$(Ix, Iy) = (x, y)$ for any pair of vectors x, y.

Corollary. Since an isometric transformation preserves norms
and scalar products, it also preserves the angle between any two
vectors, that is, the angle between them is equal to the angle
between their images.

Proof of the theorem. Substituting into (1) the sum x + y in
place of x and squaring both members, we get

$$\|I(x + y)\|^2 = \|x + y\|^2$$

or

$$(I(x + y), I(x + y)) = (x + y, x + y)$$

Using the linearity of I, we obtain

$$(Ix, Ix) + 2(Ix, Iy) + (Iy, Iy) = (x, x) + 2(x, y) + (y, y)$$

But here $(Ix, Ix) = \|Ix\|^2 = (x, x)$, $(Iy, Iy) = \|Iy\|^2 = (y, y)$
and so

$$(Ix, Iy) = (x, y) \tag{2}$$

which proves the theorem.

4. Henceforth we assume the space to be Euclidean and
n-dimensional and we denote it by $E_n$.

Set $Iy = z$ in (2). Then $y = I^{-1}z$, whence

$$(Ix, z) = (x, I^{-1}z) \tag{2'}$$

When the vector y runs through the entire space $E_n$, the vector z
does likewise because I is nonsingular. Thus, (2') holds for all
vectors x, z in $E_n$. This means that

$$I^{-1} = \overset{*}{I} \tag{3}$$

where $\overset{*}{I}$ is the adjoint of I.

Remark. It is readily verified that the three conditions (1), (2)
and (3) are equivalent, that is, each one implies the other two.

5. The relation (3) may be rewritten as follows:

$$\overset{*}{I}I = I\overset{*}{I} = E$$


where E is the identical transformation, whence it follows that the 
matrix of I is orthogonal relative to an orthonormal basis. For this 
reason, isometric transformations are the only ones which have 
orthogonal matrices in an orthonormal basis. 
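
The orthogonality condition is easy to test numerically. The sketch below, in plain Python, assumes the matrix of a rotation of three-dimensional space about the third basis vector as the sample isometric transformation; the helper names and the angle 0.7 are hypothetical and serve only as an illustration.

```python
import math

def transpose(m):
    """Transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Product of two square matrices of the same order."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, x):
    """Product of a matrix and a column of coordinates."""
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]

def norm(x):
    """Euclidean norm in an orthonormal basis."""
    return math.sqrt(sum(c * c for c in x))

theta = 0.7  # an arbitrary angle chosen for illustration
I = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]

product = matmul(transpose(I), I)       # should be close to the unit matrix E
x = [1.0, -2.0, 3.0]
norms = (norm(x), norm(apply(I, x)))    # the two norms should agree
```

Up to rounding errors, `product` is the unit matrix and the two entries of `norms` coincide, in agreement with conditions (3) and (1).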

6. Let $e_1, \ldots, e_n$ be an orthonormal basis and let

$$\begin{aligned}
Ie_1 &= l_{11}e_1 + l_{21}e_2 + \ldots + l_{n1}e_n, \\
&\;\;\vdots \\
Ie_n &= l_{1n}e_1 + l_{2n}e_2 + \ldots + l_{nn}e_n
\end{aligned}$$

The vectors $Ie_1, \ldots, Ie_n$ also form an orthonormal basis because
an isometric transformation sends unit vectors into unit vectors,
and orthogonal vectors into orthogonal vectors.

By the foregoing, the transformation matrix $\|l_{ij}\|$ is orthogonal.

Thus, with every isometric transformation is associated an
orthogonal matrix, and with every orthogonal matrix is associated an
isometric transformation. For every pair of orthonormal bases
there is a unique isometric transformation that carries one of the
given bases into the other.

7. Note some properties of isometric transformations.

(1) If the transformation I has an eigenvector e, that is, if
$Ie = \lambda e$, $e \neq 0$, then $\lambda = \pm 1$ (this follows immediately from the
preservation of the norm of a vector).

(2) $\det I = \pm 1$, for the determinant of an orthogonal matrix
is always equal to $\pm 1$, and $\det I$ is an invariant. Hence to prove
this property it suffices to consider the transformation I in an
orthonormal basis.

If $\det I = +1$, then the bases $e_1, \ldots, e_n$ and $Ie_1, \ldots, Ie_n$ are
of the same orientation and we have a transformation similar to
the motion of a rigid body.

If $\det I = -1$, the bases $e_1, \ldots, e_n$ and $Ie_1, \ldots, Ie_n$ have
different orientations and the transformation is a reflection.

(3) If e is an eigenvector and $L_e$ is the orthogonal complement
of the linear hull of e, then $L_e$ is an invariant subspace.

Proof. Let $x \in L_e$; then $(Ix, Ie) = (x, e) = 0$. On the other hand,
$(Ix, Ie) = \lambda(Ix, e)$, and $\lambda = \pm 1 \neq 0$, so that $(Ix, e) = 0$,
that is, $Ix \in L_e$.


8. Let us consider some examples. We assume that the transfor¬ 
mation matrices are written out in orthonormal bases. The ortho¬ 
gonality of the matrices (and hence the isometric nature of the 
transformations that follow) is established by a simple check, 
which we leave to the reader. 

(1) The identical transformation is isometric. 

(2) The reflection of n-dimensional space about the hyperplane
$x_1 = 0$ carries an arbitrary vector $x = x_1e_1 + x_2e_2 + \ldots + x_ne_n$
into the vector $Ix = -x_1e_1 + x_2e_2 + \ldots + x_ne_n$. The matrix of
this transformation is

$$I = \begin{Vmatrix} -1 & & & 0 \\ & 1 & & \\ & & \ddots & \\ 0 & & & 1 \end{Vmatrix}$$

The subspace $x_1 = 0$ is invariant and the transformation induced
in it is identical.

(3) n = 1. In a one-dimensional space, an isometric
transformation is either identical, $Ix = x$, or is a reflection, $Ix = -x$.

Indeed, in the one-dimensional case the matrix I consists of only
one element $l_{11}$. Taking into account the second property of the
preceding subsection, we find $l_{11} = \det \|l_{11}\| = \pm 1$.

(4) n = 2. A rotation of a plane through an angle $\theta$ is an
isometric transformation (see Section 7, Chapter VIII).

(5) n = 3. The rotation of space through an angle $\theta$ about the
axis $x_3$ is given by the matrix

$$I = \begin{Vmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

$e_3$ is an eigenvector and the subspaces $L(e_3)$, $L(e_1, e_2)$ are
invariant.

(6) A multidimensional generalization of the preceding example
is the rotation of n-dimensional space about an (n - 2)-dimensional
subspace $L(e_3, \ldots, e_n)$:

$$I = \begin{Vmatrix} \cos\theta & -\sin\theta & & & 0 \\ \sin\theta & \cos\theta & & & \\ & & 1 & & \\ & & & \ddots & \\ 0 & & & & 1 \end{Vmatrix}$$

The subspace $L(e_3, \ldots, e_n)$ remains fixed and an identical
transformation is induced in it.

(7) n = 4. A transformation with the matrix

$$\begin{Vmatrix} \cos\alpha & -\sin\alpha & 0 & 0 \\ \sin\alpha & \cos\alpha & 0 & 0 \\ 0 & 0 & \cos\beta & -\sin\beta \\ 0 & 0 & \sin\beta & \cos\beta \end{Vmatrix}$$

may be regarded as a simultaneous rotation of the space $E_4$ in
two mutually orthogonal directions: through the angle $\alpha$ about the
plane $L(e_3, e_4)$ and through the angle $\beta$ about the plane
$L(e_1, e_2)$.

The planes $L(e_1, e_2)$ and $L(e_3, e_4)$ are invariant subspaces. If the
angles $\alpha$, $\beta$ are not multiples of $\pi$, then the transformation I does
not have any eigenvectors since its characteristic polynomial

$$p(\lambda) = (\lambda^2 - 2\lambda\cos\alpha + 1)(\lambda^2 - 2\lambda\cos\beta + 1)$$

has only complex roots.

9. To get a better feeling of the specific nature of isometric 
transformations in spaces of dimension greater than three, com¬ 
pare the last example with a rotation of three-dimensional space. 

We assume that the spaces rotate uniformly and the matrices
in examples (5) and (7) of Subsection 8 characterize the rotation
in unit time. Then a rotation in the three-dimensional case during
time t is specified by the matrix

$$I(t) = \begin{Vmatrix} \cos\theta t & -\sin\theta t & 0 \\ \sin\theta t & \cos\theta t & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

and in the four-dimensional case by the matrix

$$I(t) = \begin{Vmatrix} \cos\alpha t & -\sin\alpha t & 0 & 0 \\ \sin\alpha t & \cos\alpha t & 0 & 0 \\ 0 & 0 & \cos\beta t & -\sin\beta t \\ 0 & 0 & \sin\beta t & \cos\beta t \end{Vmatrix}$$

Clearly, in the three-dimensional case all points of the axis of
rotation are fixed, while the remaining points of the space describe
circles whose centres lie on the axis of rotation. The planes of
these circles are perpendicular to the axis of rotation. Each point
performs a complete rotation during the same time $T = 2\pi/\theta$.

The picture is different in the four-dimensional case. Only the
origin is fixed. The points of the invariant planes $L(e_1, e_2)$ and
$L(e_3, e_4)$ move in circles centred at the origin, but the rotation
periods about the origin in these two planes differ: in the plane
$L(e_1, e_2)$ the period is $T_1 = 2\pi/\alpha$ and in the plane $L(e_3, e_4)$ the
period is $T_2 = 2\pi/\beta$. If the angles $\alpha$ and $\beta$ are incommensurable,
then every point of space that does not belong to one of these two
planes moves along a non-closed path without self-intersections and
will never return to the original position. Indeed, for a point to
return to its original position, it is necessary and sufficient that
its projections on the planes $L(e_1, e_2)$ and $L(e_3, e_4)$ return
simultaneously to their initial positions, which is impossible since the
periods $T_1$ and $T_2$ are incommensurable.

If the angles $\alpha$ and $\beta$ are commensurable, the periods $T_1$ and
$T_2$ are also commensurable, and then the paths of all points are
closed but will not in general be circles. In this case, the rotation
period of points not lying in the planes $L(e_1, e_2)$ and $L(e_3, e_4)$ is
equal to the least common multiple of the periods $T_1$ and $T_2$.
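
The commensurable case can be illustrated on numbers. The sketch below, in plain Python, assumes the particular angles $\alpha = 2\pi/3$ and $\beta = 2\pi/4$ (so that $T_1 = 3$ and $T_2 = 4$) and inspects only integer instants of time; the helper names `rotate4` and `distance` and the starting point are hypothetical.

```python
import math

def rotate4(point, alpha, beta, t):
    """Apply I(t): rotate by alpha*t in L(e1, e2) and by beta*t in L(e3, e4)."""
    x1, x2, x3, x4 = point
    ca, sa = math.cos(alpha * t), math.sin(alpha * t)
    cb, sb = math.cos(beta * t), math.sin(beta * t)
    return (ca * x1 - sa * x2, sa * x1 + ca * x2,
            cb * x3 - sb * x4, sb * x3 + cb * x4)

def distance(p, q):
    """Euclidean distance between two points of four-dimensional space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

alpha = 2 * math.pi / 3   # period T1 = 2*pi/alpha = 3
beta = 2 * math.pi / 4    # period T2 = 2*pi/beta = 4
start = (1.0, 0.0, 1.0, 0.0)   # a point lying in neither invariant plane

# Integer instants t = 1, ..., 24 at which the point is back at its start.
return_times = [t for t in range(1, 25)
                if distance(rotate4(start, alpha, beta, t), start) < 1e-9]
first_return = return_times[0]
```

With these assumed angles the first return occurs at t = 12, the least common multiple of the periods 3 and 4.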

§ 8. The canonical form of an isometric transformation

1. Theorem. For every isometric transformation I in n-dimensional
Euclidean space $E_n$ there exists an orthonormal basis
$e_1, \ldots, e_n$ in which the transformation matrix has the following
canonical form

$$\begin{Vmatrix}
\pm 1 & & & & & & 0 \\
& 1 & & & & & \\
& & \ddots & & & & \\
& & & 1 & & & \\
& & & & I_{\theta_1} & & \\
& & & & & \ddots & \\
0 & & & & & & I_{\theta_k}
\end{Vmatrix} \tag{1}$$

Here we use $I_\theta$ to denote the matrix of rotation of a
two-dimensional plane through the angle $\theta$:

$$I_\theta = \begin{Vmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{Vmatrix}$$

The sign $\pm$ in front of the first diagonal element of matrix (1)
coincides with the sign of the determinant of the transformation.
The submatrices $I_{\theta_i}$ may be altogether absent (compare the first
three examples of Subsection 8, Section 7) or may occupy the
entire diagonal (see example (7) of Subsection 8, Section 7).

The proof of the theorem is given below in Subsection 8 of this
section.

2. Now let us look into the geometric meaning of this theorem.
It is easy to verify that matrix (1) can be represented as a product:

$$\begin{Vmatrix} \pm 1 & & & 0 \\ & 1 & & \\ & & \ddots & \\ 0 & & & 1 \end{Vmatrix}
\times
\begin{Vmatrix}
1 & & & & & & 0 \\
& \ddots & & & & & \\
& & 1 & & & & \\
& & & I_{\theta_1} & & & \\
& & & & 1 & & \\
& & & & & \ddots & \\
0 & & & & & & 1
\end{Vmatrix}
\times \ldots \times
\begin{Vmatrix} 1 & & & 0 \\ & \ddots & & \\ & & 1 & \\ 0 & & & I_{\theta_k} \end{Vmatrix} \tag{2}$$

Formula (2) shows that the transformation I reduces to the
following.

If $\det I = -1$, then first a reflection is performed about the
hyperplane, which, relative to the basis $e_1, \ldots, e_n$, has the
equation $x_1 = 0$ (if $\det I = +1$, there is no reflection).

Following that, the linear hull $L(e_1, \ldots, e_{n-2k})$ of the first
n - 2k basis vectors remains fixed, while the entire space is
successively rotated through the angles $\theta_1, \ldots, \theta_k$ about the
(n - 2)-dimensional subspaces
$L(e_1, \ldots, e_{n-2k}, e_{n-2k+3}, \ldots, e_n)$, ...,
$L(e_1, \ldots, e_{n-2})$. All two-dimensional planes
$L(e_{n-2k+1}, e_{n-2k+2})$, ..., $L(e_{n-1}, e_n)$ in the directions of
which these rotations occur are orthogonal among themselves and also
to the subspace $L(e_1, \ldots, e_{n-2k})$.


3. Corollary to the theorem of Subsection 1. (1) Let the
isometric transformation I have $\det I = -1$. Then it has an
eigenvector and, for even n, at least two independent eigenvectors.

(2) In two-dimensional Euclidean space only three types of
isometric transformations are possible:

(a) the identical transformation;

(b) a reflection relative to some one-dimensional subspace;

(c) a rotation through an angle $\theta$ ($0 < \theta < 2\pi$).

(3) In three-dimensional Euclidean space only four types of 
isometric transformations are possible: 

(a) the identical transformation; 





(b) a reflection relative to a two-dimensional subspace; 

(c) a rotation through the angle $\theta$ ($0 < \theta < 2\pi$) about a
one-dimensional subspace;

(d) the product of a reflection relative to some two-dimensional 
subspace by a rotation about its orthogonal complement. 

4. We now take up the proof of the theorem stated in
Subsection 1. First, however, in Subsections 5-7 we will establish a few
auxiliary propositions that are of interest in themselves.

5. Lemma 1. For every linear transformation A in a real linear
space L there exists either a one-dimensional invariant subspace or
a two-dimensional invariant subspace such that the transformation
induced in it has a positive determinant.

Proof. If the characteristic polynomial $p(\lambda)$ has a real root $\lambda_1$,
then by Section 6, Chapter VII, to this root corresponds an
eigenvector whose linear hull is an invariant subspace.

Let $p(\lambda)$ have no real roots. Then the transformation A does not
have a single eigenvector.

We write down the relation $(A - \lambda E)x = 0$ relative to an
arbitrary basis $e_1, \ldots, e_n$ and substitute for $\lambda$ the complex root
$\alpha + i\beta$ of the characteristic polynomial $p(\lambda)$. This yields a
homogeneous system of linear equations with the unknowns $x^1, \ldots, x^n$
and with complex coefficients. This system can be written in
matrix form as follows:

$$(A - (\alpha + i\beta)E)\begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} \tag{3}$$

The determinant of system (3) is zero:

$$\det(A - (\alpha + i\beta)E) = p(\alpha + i\beta) = 0$$

and so (3) has a nontrivial solution $(x^1, \ldots, x^n)$. Decompose it
into the real and imaginary parts:

$$\begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = \begin{pmatrix} y^1 + iz^1 \\ \vdots \\ y^n + iz^n \end{pmatrix} \tag{4}$$

and consider the vectors

$$y = y^1e_1 + \ldots + y^ne_n \in L, \qquad z = z^1e_1 + \ldots + z^ne_n \in L$$

We use the same symbols y and z to denote the elements
$(y^1, \ldots, y^n)$ and $(z^1, \ldots, z^n)$ of the coordinate (component)
space $K_n$. Putting solution (4) into system (3) we get

$$\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} = (A - (\alpha + i\beta)E)(y + iz) = (Ay - \alpha y + \beta z) + i(Az - \alpha z - \beta y)$$

whence

$$\left.\begin{aligned} Ay &= \alpha y - \beta z, \\ Az &= \beta y + \alpha z \end{aligned}\right\} \tag{5}$$


Equations (5) are obtained algebraically as relations between the
elements of the coordinate space $K_n$. Geometrically, formulas (5)
express the action of the transformation A on the vectors $y, z \in L$,
written in the basis $e_1, \ldots, e_n$. But the transformation does not
depend on the choice of basis, and so (5) may be regarded as
invariant vector equations in the space L.

We now show that the vectors y and z are linearly independent.
First of all, $z \neq 0$, since otherwise the first of equations (5) would
signify that A has an eigenvector y ($Ay = \alpha y$). Therefore, if there
is a linear relationship between y and z, then

$$y = \gamma z \tag{6}$$

Substituting (6) into the second equation of (5), we get
$Az = (\alpha + \beta\gamma)z$, which likewise contradicts the absence of
eigenvectors in the transformation A.

Thus, y and z are linearly independent and their linear hull
$L(y, z)$ is two-dimensional.

Formulas (5) show that $L(y, z)$ is an invariant subspace of A
and permit finding the determinant of the transformation induced
in $L(y, z)$. This determinant is

$$\begin{vmatrix} \alpha & \beta \\ -\beta & \alpha \end{vmatrix} = \alpha^2 + \beta^2 > 0$$

since $\beta$ is definitely nonzero (otherwise the root $\lambda = \alpha + i\beta$ would
be real). The proof of Lemma 1 is complete.
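
The construction in the proof can be followed on a small example. The sketch below, in plain Python, assumes a hypothetical 2 x 2 real matrix whose characteristic polynomial $\lambda^2 - 4\lambda + 5$ has the complex roots $2 \pm i$; the matrix, the chosen complex solution and the helper name `apply` are illustrative assumptions only.

```python
def apply(A, x):
    """Product of a matrix and a column of coordinates."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

# A sample real matrix with characteristic polynomial lambda^2 - 4*lambda + 5.
A = [[1.0, -2.0],
     [1.0, 3.0]]
alpha, beta = 2.0, 1.0          # the complex root alpha + i*beta = 2 + i

# One complex solution of (A - (alpha + i*beta)E)x = 0 is x = (2, -1 - i);
# its real and imaginary parts give the vectors y and z.
y = [2.0, -1.0]
z = [0.0, -1.0]

# Residual of the complex system (3); it should vanish.
lam = complex(alpha, beta)
x = [complex(y[i], z[i]) for i in range(2)]
residual = [sum(A[i][k] * x[k] for k in range(2)) - lam * x[i] for i in range(2)]

# Both sides of relations (5).
Ay, Az = apply(A, y), apply(A, z)
rhs_y = [alpha * y[i] - beta * z[i] for i in range(2)]   # alpha*y - beta*z
rhs_z = [beta * y[i] + alpha * z[i] for i in range(2)]   # beta*y + alpha*z

# Determinant of the transformation induced in L(y, z).
induced_det = alpha ** 2 + beta ** 2
```

Here $L(y, z)$ is the whole plane, so `induced_det` agrees with the determinant of the sample matrix itself.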

6. Lemma 2. Given in Euclidean space $E_n$ an isometric
transformation I. Let the subspace E' be invariant under I. Then the
orthogonal complement E'' of E' is also an invariant subspace.

Proof. Lemma 2 follows from the fact that an isometric
transformation is nonsingular and preserves orthogonality of vectors.
Indeed, if $x \in E'$, $y \in E''$, then $(x, y) = 0$ and $(Ix, Iy) = 0$. When
vector x runs through the entire subspace E', its image Ix also
runs through E' entirely (see Subsection 2, Section 4, Chapter VII).
Hence, vector Iy is orthogonal to E' and therefore lies in E''. The
vector y in E'' may be taken arbitrarily. Thus, $I(E'') \subset E''$.

7. Lemma 3. In two-dimensional Euclidean space, every
isometric transformation I with positive determinant is a rotation
through an angle $\theta$.

Proof. Take an orthonormal basis $e_1, e_2$. Let the vector $Ie_1$ form
an angle $\theta$ with the vector $e_1$. Since the length of $Ie_1$ is equal to
that of $e_1$, it follows that $Ie_1$ is obtained from $e_1$ by a rotation
through the angle $\theta$. The vector $Ie_2$ is orthogonal to $Ie_1$ and the
orientation of the new basis $Ie_1, Ie_2$ is the same as that of the
original basis (since $\det I > 0$). Hence, $Ie_2$ is obtained from $e_2$ by
a rotation through the same angle $\theta$. The transformation I
preserves the lengths of all vectors and the angles between any two
vectors, and so all vectors rotate through the same angle $\theta$. In the
particular case of $\theta$ being a multiple of $2\pi$, the transformation is
identical.

8. Proof of the theorem. Lemmas 1 and 2 permit decomposing
the space $E_n$ into a direct sum of one-dimensional and
two-dimensional invariant subspaces. By Lemma 1 there exists an invariant
subspace $E_{(1)}$; by Lemma 2 its orthogonal complement is also
invariant, and Lemma 1 can again be applied to it, etc. We thus
obtain

$$E_n = E_{(1)} \oplus E_{(2)} \oplus \ldots \oplus E_{(p)} \tag{7}$$

where p = n - k (k is the number of two-dimensional subspaces
in the sum (7)).

We assume that in the right member of (7) first come
one-dimensional subspaces in which I has eigenvalues +1, then
one-dimensional subspaces in which the eigenvalues are equal to -1,
and finally two-dimensional subspaces in which there are no
eigenvalues (and the determinant of the induced transformation is
positive in accordance with Lemma 1). The transformations induced
in the two-dimensional subspaces of (7) are rotations through the
angles $\theta_1, \ldots, \theta_k$ by Lemma 3. In each of the subspaces
$E_{(1)}, \ldots, E_{(p)}$ choose an orthonormal basis. Their union will yield an
orthonormal basis $e_1, \ldots, e_n$ of $E_n$ since all $E_{(i)}$ are pairwise
orthogonal. Relative to the basis $e_1, \ldots, e_n$, the transformation
matrix becomes

$$\begin{Vmatrix}
1 & & & & & & & & 0 \\
& \ddots & & & & & & & \\
& & 1 & & & & & & \\
& & & -1 & & & & & \\
& & & & \ddots & & & & \\
& & & & & -1 & & & \\
& & & & & & I_{\theta_1} & & \\
& & & & & & & \ddots & \\
0 & & & & & & & & I_{\theta_k}
\end{Vmatrix} \tag{8}$$

Noting that

$$\begin{Vmatrix} \cos\pi & -\sin\pi \\ \sin\pi & \cos\pi \end{Vmatrix} = \begin{Vmatrix} -1 & 0 \\ 0 & -1 \end{Vmatrix} \tag{9}$$

we can replace an even number of minus ones on the diagonal
of matrix (8) by one half that number of submatrices of type (9).
Geometrically, this means that the product of two reflections of
the plane with respect to mutually perpendicular straight lines is
equal to a rotation of the plane through the angle $\pi$.

If the number of minus ones is odd, one of them will fail to
enter into a submatrix of type (9) and it can then be shifted to
the start of the diagonal by renumbering the basis vectors.

Then matrix I will assume the form (1) (the number k will
generally change compared with formula (8) due to the appearance
of new submatrices of type (9)). The theorem is proved.


§ 9. The motion of a rigid body with one fixed point 

1. Consider in three-dimensional Euclidean space the motion of
a rigid body with one fixed point O, which we take for the origin
with the orthonormal basis $e_1, e_2, e_3$.

At the initial instant of time, let an arbitrary point of the body
be at $M_0$, and during time t move to point $M_t$. Set

$$\overrightarrow{OM_0} = x, \qquad \overrightarrow{OM_t} = y$$

and denote by $I(t)$ the transformation that associates with vector x
a vector y:

$$y = I(t)x$$

Geometrically, the motion of a rigid body means that every
rectilinear segment formed by points of the body is carried, via
the motion, into a rectilinear segment of the same length.

It is therefore possible to construct a variable orthonormal
basis $e_{1t}, e_{2t}, e_{3t}$ moving together with the body, in which the
coordinates of the vector $\overrightarrow{OM_t}$ preserve constant numerical values.

For every fixed t, the change from basis $e_1, e_2, e_3$ to basis
$e_{1t}, e_{2t}, e_{3t}$ is specified by an orthogonal matrix. From the foregoing it
follows that the transformation $I(t)$ is linear and isometric for every t.
Relative to the basis $e_1, e_2, e_3$ this transformation is written thus:

$$y_k = \sum_{\alpha} l_{k\alpha}(t)\, x_\alpha$$

Suppose the components $l_{k\alpha}(t)$ are differentiable functions of
the time t.

Let us find the distribution of linear velocities v of points of the
body at an arbitrary time t. In other words, we wish to find v for
every point $M_t$, that is, v as a function of y. For every point we
have

$$v = v_1e_1 + v_2e_2 + v_3e_3 = \frac{dy_1}{dt}e_1 + \frac{dy_2}{dt}e_2 + \frac{dy_3}{dt}e_3$$

Thus

$$v_k = \sum_{\alpha} l'_{k\alpha}(t)\, x_\alpha$$

This can be written symbolically as

$$v = I'_t x$$

where $I'_t$ is a linear transformation whose matrix is obtained by
differentiating the elements of matrix I with respect to the
argument t. Noting that

$$x = I^{-1}y = \overset{*}{I}y$$

we obtain the desired function as a linear transformation:

$$v = I'_t\overset{*}{I}y$$
f • 

2. Let us investigate the transformation $A = I'_t\overset{*}{I}$ in more detail.
We find the adjoint $\overset{*}{A}$. Take advantage of the fact that in an
orthonormal basis it suffices to take the transpose of the matrix in
order to pass to the adjoint transformation, while taking the
transpose of a product of two matrices is performed via the familiar
formula: $(I'_tI^*)^* = (I^*)^*(I'_t)^*$. Taking the transpose of a transpose
returns the matrix to its initial form, and the differentiation of
elements of the matrix with respect to t is clearly commutative with
the transpose operation. We therefore have the matrix equations

$$A^* = (I'_tI^*)^* = I(I^*)'_t \tag{1}$$

On the other hand, we have $I(t)\overset{*}{I}(t) = E$ for transformations
and also the equation

$$I(t)I^*(t) = E \tag{2}$$

for their matrices, which hold identically with respect to t. It is
possible to prove that a product of matrices can be differentiated
by the same rule as a product of functions, and so from (2) we
have

$$(II^*)'_t = I'_tI^* + I(I^*)'_t = E'_t = 0 \tag{3}$$

From (1) and (3) it follows that

$$A + \overset{*}{A} = 0$$

We have thus established that the linear transformation

$$v = Ay$$

is skew-adjoint ($\overset{*}{A} = -A$).

In Section 6 it was shown that a constant skew-adjoint
transformation yields an instantaneous distribution of velocities of a
body rotating with a constant angular velocity about a fixed axis
and can be represented by the formula

$$v = [a \times y]$$

In the case at hand, the transformation A and the vector a
depend on the time.
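
The skew-symmetry of the matrix of A can be observed in the simplest case. The sketch below, in plain Python, assumes a uniform rotation about the third basis vector with a hypothetical angular speed 0.5; the derivative of the matrix is written out by hand, and all helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def rotation(theta, t):
    """Matrix I(t) of a uniform rotation about e3 with angular speed theta."""
    c, s = math.cos(theta * t), math.sin(theta * t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_derivative(theta, t):
    """Element-wise derivative of the matrix I(t) with respect to t."""
    c, s = math.cos(theta * t), math.sin(theta * t)
    return [[-theta * s, -theta * c, 0.0],
            [theta * c, -theta * s, 0.0],
            [0.0, 0.0, 0.0]]

theta, t = 0.5, 1.3   # illustrative values
# Matrix of A = I'_t I* (the transpose stands for the adjoint here).
A = matmul(rotation_derivative(theta, t), transpose(rotation(theta, t)))

# Components of the angular velocity vector a read off from v = [a x y].
a = (A[2][1], A[0][2], A[1][0])
```

In this sample the matrix A does not depend on t and the vector a is directed along $e_3$ with length equal to the assumed angular speed.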

Conclusion. When a rigid body is in motion with one fixed
point O, the field of instantaneous linear velocities of its points is
at every instant of time the same as if the body were in rotation
about an axis with constant angular velocity, but the axis and the
angular velocity depend on the instant of time chosen.

That is why, in mechanics, one speaks of the “rotation of a body
about a fixed point” and not the “motion of a body with a fixed
point”.

The vector $a = a(t)$ is called the angular velocity of
instantaneous rotation of the body. The straight line passing through O
in the direction of $a(t)$ is known as the instantaneous axis of
rotation of the body.

How the instantaneous axis of rotation changes with time is
clearly seen in the case of a spinning top.

§ 10. The curvature and torsion of a space curve

1. Given a curve S in three-dimensional Euclidean space. For
the sake of pictorialness, we assume that S is the path of a point M
moving with unit velocity, that is to say, during unit time it
traverses an arc of S whose length is equal to unity. Under this
condition, the time spent is numerically equal to the length of the
path traversed. Denote the length by s and view it as an
independent argument.

Denote by t the vector of instantaneous velocity of M. We can
consider it to be a unit vector:

$$(t, t) = 1 \tag{1}$$

It lies along a tangent to the curve S in the direction of motion
and is called the tangent vector.

Fig. 63

If the line is not straight, then t changes its direction in space.
Therefore, the point M experiences an acceleration equal to the
derivative of the velocity vector, or $t'_s$.

It can be proved that the scalar product of vectors is
differentiated by the same rule as a product of functions. Differentiating (1)
with respect to s, we find that the acceleration is orthogonal to the
velocity:

$$(t, t)'_s = 2(t, t'_s) = 0$$

Putting $k(s) = |t'_s|$ and assuming that $k(s) \neq 0$, we introduce
the unit vector n, which is coincident in direction with $t'_s$ (Fig. 63).
Then

$$t'_s = kn \tag{2}$$

k is called the curvature of the curve at the given point M. By
definition, $k \geq 0$. Vector n is called the principal normal vector,
while the plane passing through M parallel to vectors t and n is
termed the osculating plane of the space curve S at the point M.

Let us construct the unit vector $b = [t \times n]$. It is perpendicular
to the osculating plane and is called the binormal vector at the
point M (Fig. 63).

For every value of the argument s, the triple of unit vectors t,
n, b forms an orthonormal basis that is connected in a natural
manner with the geometric properties of the curve in the
neighbourhood of M. The basis t, n, b is called a trihedral (or moving
trihedral).

2. Lay off the vectors of the moving trihedral in space from the
fixed point O. Then as the argument s varies, the trihedral rotates
about O as a rigid body.

The velocity vector of instantaneous rotation of a moving
trihedral is called the Darboux vector. Denote it by $d = d(s)$. For
an arbitrary vector u rigidly anchored to the moving trihedral we
have, by the results of the preceding section,

$$u'_s = [d \times u] \tag{3}$$

In particular

$$t'_s = [d \times t] \tag{4}$$

Now let us see how the Darboux vector is located relative to
the trihedral. We introduce the notation $\sigma = (d, t)$ and from (4)
find the projections of vector d on the directions n and b. To do so,
substitute into (4) the expansion

$$d = \sigma t + \lambda n + \mu b$$

with undetermined coefficients $\lambda$, $\mu$. Noting that $t'_s = kn$, we get

$$kn = \sigma[t \times t] + \lambda[n \times t] + \mu[b \times t] = -\lambda b + \mu n$$

whence $\lambda = 0$, $\mu = k$, and so

$$d = \sigma t + kb \tag{5}$$

(Fig. 64).

The function $\sigma = \sigma(s)$ is called the torsion of the curve S.
Putting (5) into (3), we obtain

$$u'_s = \sigma[t \times u] + k[b \times u] \tag{6}$$

Formula (6) shows that the instantaneous rotation of the
trihedral can be decomposed into a sum of two rotational motions:
about the tangent and about the binormal. The first component
has an angular velocity equal to the torsion of the curve, the
second, an angular velocity equal to the curvature of the curve.
The angular velocity of the total rotation of the trihedral is
$|d| = \sqrt{k^2 + \sigma^2}$.

3. Putting the vectors n and b into (6) in place of u, we find
the expansion of their derivatives with respect to the basis t, n, b.
Together with (2) these expansions constitute the so-called Frenet
formulas:

$$\begin{aligned} t'_s &= kn, \\ n'_s &= -kt + \sigma b, \\ b'_s &= -\sigma n \end{aligned}$$

which are important in the theory of curves.
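
The formulas above can be tried out on a circular helix. The sketch below, in plain Python, assumes the helix $(a\cos(s/c),\ a\sin(s/c),\ bs/c)$ with $c = \sqrt{a^2 + b^2}$, for which s is the arc length and the curvature and torsion are known to equal $a/c^2$ and $b/c^2$. Derivatives are approximated by central differences, so the results are only approximate; all helper names and the values of a, b and the step h are hypothetical.

```python
import math

a, b = 2.0, 1.0                 # radius and pitch parameter of the helix
c = math.sqrt(a * a + b * b)    # with this scaling s is the arc length

def tangent(s):
    """Unit tangent t(s) of the helix (a cos(s/c), a sin(s/c), b s/c)."""
    return [-a / c * math.sin(s / c), a / c * math.cos(s / c), b / c]

def cross(u, v):
    """Vector product [u x v] in a right-handed orthonormal basis."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    """Scalar product in an orthonormal basis."""
    return sum(p * q for p, q in zip(u, v))

def derivative(f, s, h=1e-4):
    """Central-difference approximation of the derivative of a vector function."""
    return [(p - q) / (2 * h) for p, q in zip(f(s + h), f(s - h))]

def normal(s):
    """Principal normal n: the unit vector along t'_s."""
    dt = derivative(tangent, s)
    k = math.sqrt(dot(dt, dt))
    return [x / k for x in dt]

def binormal(s):
    """Binormal b = [t x n]."""
    return cross(tangent(s), normal(s))

s0 = 0.8
dt = derivative(tangent, s0)
curvature = math.sqrt(dot(dt, dt))                       # k = |t'_s|
torsion = -dot(derivative(binormal, s0), normal(s0))     # from b'_s = -sigma n
```

For the assumed values a = 2, b = 1 the computed curvature and torsion should be close to 0.4 and 0.2 respectively.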

§ 11. The decomposition of an arbitrary linear transformation into
the product of a self-adjoint and an isometric transformation

1. The purpose of this section is to represent any linear
transformation in Euclidean space in the form of a composition of a
self-adjoint transformation and an isometric transformation.

2. Definition. A self-adjoint transformation A is said to be
nonnegative if $(Ax, x) \geq 0$ for any x.

Lemma 1. If a self-adjoint transformation is nonnegative, then
all the roots of its characteristic polynomial are nonnegative.

Remark. The fact that the transformation is self-adjoint is very
essential, for then we are certain that all the characteristic roots
are real.

Proof. Let $\lambda$ be a characteristic root and x the corresponding
eigenvector. Then $Ax = \lambda x$ and

$$(Ax, x) = \lambda(x, x) = \lambda\|x\|^2 \geq 0$$

since $(Ax, x) \geq 0$, whence $\lambda \geq 0$.

Lemma 2. If A is a nonnegative self-adjoint transformation,
then there is a nonnegative self-adjoint transformation B such that
A = BB.

Remark. The transformation B is called the square root of
transformation A.

Proof. Because A is self-adjoint, there is an orthonormal basis
in which matrix A has diagonal form:

$$A = \begin{Vmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{Vmatrix}$$

By Lemma 1, all $\lambda_i \geq 0$ and so $\sqrt{\lambda_i}$ are real nonnegative
numbers.

Relative to the same basis, determine the transformation B by
the matrix

$$B = \begin{Vmatrix} \sqrt{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{\lambda_n} \end{Vmatrix}$$

The basis is orthonormal, matrix B is diagonal, and so
transformation B is self-adjoint ($\overset{*}{B} = B$). From the formula for B it
is clear that for the matrices we have the relation BB = A and so
the same relation holds for the transformations. Write out y = Bx
in terms of coordinates:

$$\begin{cases} y_1 = \sqrt{\lambda_1}\, x_1, \\ \;\;\vdots \\ y_n = \sqrt{\lambda_n}\, x_n \end{cases}$$

whence

$$(Bx, x) = (y, x) = x_1y_1 + \ldots + x_ny_n = \sqrt{\lambda_1}\, x_1^2 + \ldots + \sqrt{\lambda_n}\, x_n^2 \geq 0$$

so that the transformation B is nonnegative. Lemma 2 is proved.

Remark. Consider the scalar product (Ax, x). If this quadratic
form is positive definite, then the transformation A is said to be
positive definite or positive. In that case A is nonsingular and all
its characteristic roots are positive. From the proof of Lemma 2 it
is evident that the square root of a positive transformation is also
a positive transformation.


3. In this section we will denote an adjoint transformation by
an asterisk placed at the side of the symbol (as in the case of the
transpose of a matrix) and not on top. Thus A* is the adjoint
of A. This symbolism is more convenient in computations but we
must bear in mind that if the same letter A is used to denote the
matrix of the transformation A, then the transpose A* of the matrix
will, generally speaking, be the matrix of the adjoint
transformation A* only relative to an orthonormal basis.

Later on we will need identities that hold both for matrices and
for transformations:

(1) $(AB)^* = B^*A^*$,

(2) $(A^*)^* = A$,

(3) $(AB)^{-1} = B^{-1}A^{-1}$,

(4) $(A^*)^{-1} = (A^{-1})^*$.

We first have to verify the truth of these formulas for matrices
(see Sections 2, 3, Chapter II) and then consider transformations
in an orthonormal basis. Relative to such a basis, the above
formulas for transformations follow immediately from the matrix
equations.


4. Theorem. For every nonsingular linear transformation A in
n-dimensional Euclidean space $E_n$ there exists a self-adjoint
transformation B and an isometric transformation I such that

$$A = IB \tag{1}$$

Remark. Similarly, there exist a self-adjoint transformation $B_1$
and an isometric transformation $I_1$ such that $A = B_1I_1$.

Proof of the theorem. Consider the transformation A*A. It is
self-adjoint:

$$(A^*A)^* = A^*(A^*)^* = A^*A$$

It is nonsingular since A is nonsingular by hypothesis. Besides,
the transformation A*A is positive definite:

$$(A^*Ax, x) = (Ax, Ax) = \|Ax\|^2 > 0$$

if $x \neq 0$. By Lemma 2 we can take the square root of A*A:

$$\sqrt{A^*A} = B, \qquad A^*A = BB$$

where B is a positive definite self-adjoint transformation. And so

$$A = (A^*)^{-1}BB$$

Putting

$$I = (A^*)^{-1}B$$

we get the following representation for A:

$$A = IB$$

It remains to prove that I is an isometric transformation. To do
so, compute $I^*$ using the self-adjointness of B:

$$I^* = B^*((A^*)^{-1})^* = BA^{-1}$$

Furthermore we have

$$I^*I = BA^{-1}(A^*)^{-1}B = B(A^*A)^{-1}B$$

By the construction of the transformation B, $A^*A = BB$,
so that

$$I^*I = BB^{-1}B^{-1}B = E$$

whence follows the isometric nature of the transformation I. The
proof is complete.
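
The decomposition can be carried out explicitly for a 2 x 2 matrix. The sketch below, in plain Python, follows the construction $B = \sqrt{A^*A}$, $I = (A^*)^{-1}B$ of the proof, but takes the square root by a closed formula valid only for symmetric positive definite 2 x 2 matrices rather than by diagonalization as in Lemma 2. The sample matrix and all helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    """Transpose of a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def inverse(m):
    """Inverse of a nonsingular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def sqrt_spd_2x2(m):
    """Square root of a symmetric positive definite 2x2 matrix M.

    Uses (M + sqrt(det M) E) / sqrt(tr M + 2 sqrt(det M)),
    a closed form special to the 2x2 case.
    """
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

A = [[2.0, 1.0],
     [0.0, 1.0]]            # an arbitrary nonsingular sample matrix

B = sqrt_spd_2x2(matmul(transpose(A), A))    # B = sqrt(A* A)
I = matmul(inverse(transpose(A)), B)         # I = (A*)^{-1} B

check_isometric = matmul(transpose(I), I)    # should be close to E
check_product = matmul(I, B)                 # should be close to A
```

Up to rounding, `check_isometric` is the unit matrix and `check_product` reproduces the sample matrix A, so that B is the essential part of A and I is isometric.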

5. We shall call the self-adjoint transformation B in the ex¬ 
pansion (1) the essential part of transformation A. 

§ 12. Applications to the theory of elasticity. 

The strain tensor and the stress tensor 

1. Consider a continuous elastic medium. Fix a point O in it
and take a certain volume containing the point. Suppose that in
the absence of external forces this volume is a sphere U with
centre at O and that under the action of external forces it is
deformed and displaced. However, if we disregard parallel
displacement in space, we can regard O as being fixed. An arbitrary
point M in the sphere U is described by the vector $x = \overrightarrow{OM}$.
Suppose that as a result of deformation, M moves to position M'. Put
$\overrightarrow{OM'} = y$. Experiment shows that

$$y = Cx + r(x) \tag{1}$$

where C is a nonsingular linear transformation with positive
determinant, and for small x the vector r = r(x) is an infinitesimal
of higher order, that is,

$$\lim_{|x| \to 0} \frac{|r(x)|}{|x|} = 0 \tag{2}$$

If U is small, then the vector r may be disregarded, and then in
place of the transformation (1) we can consider the linear
transformation

$$y = Cx \tag{3}$$

Remark. If condition (2) is observed, the linear
transformation (3) is called the differential of the nonlinear
transformation (1).

Generally, for a given deformation of the elastic medium, the
linear transformation C depends on the choice of the point O.

2. By the theorem proved in the preceding section, the linear
transformation C may be decomposed into two factors:

$C = I\tilde{C}$

where $\tilde{C}$ is a self-adjoint transformation (the essential part of C)
and I is an isometric transformation with det I = +1.

The transformation $\tilde{C}$ characterizes the deformation of the elastic
medium near the point O. It constitutes a compression along three
mutually perpendicular directions and therefore carries the sphere
U into an ellipsoid V (Fig. 65).

The transformation I characterizes a rotation of V as a rigid
body about the point O (Fig. 66).

In most actually encountered cases the ellipsoid V differs but
slightly from the sphere U. For example, in metals under loads
that do not go beyond the limits of elastic deformations, the
semi-axes of the ellipsoid V ordinarily differ from the radius of the
sphere U by only a fraction of a per cent. For this reason, the
transformation $\tilde{C}$ is close to the unit transformation E and is
represented as a sum:

$\tilde{C} = E + B$

Here, B (as can readily be demonstrated) is also a self-adjoint
transformation.

Let $e_1, e_2, e_3$ be an orthonormal basis relative to which we write
the matrix $B = \|b_{ij}\|$.

The doubly covariant tensor $b_{ij}$ is called the strain tensor.

The quantity $\theta = \sum_{i=1}^{3} b_{ii}$ (the trace of the operator B) is called
the coefficient of volume change. This name is due to the fact
that the eigenvalues of operator B are small and the ratio of the
volume of the ellipsoid V to the volume of the sphere U is
approximately equal to $1 + \theta$ (to within a quantity of the order of
the square of the eigenvalues of operator B).
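
The estimate of the volume ratio can be tried on numbers. A linear transformation multiplies volumes by its determinant, so the ratio in question is det(E + B); the sketch below compares it with 1 plus the trace of B. The strain matrix used here is hypothetical sample data, chosen only so that its entries are of the order of a tenth of a per cent.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical small symmetric strain matrix B = ||b_ij|| (illustrative values).
B = [[1.0e-3, 2.0e-4, 0.0],
     [2.0e-4, -5.0e-4, 1.0e-4],
     [0.0, 1.0e-4, 8.0e-4]]

theta = sum(B[i][i] for i in range(3))               # trace of B
E_plus_B = [[(1.0 if i == j else 0.0) + B[i][j] for j in range(3)]
            for i in range(3)]
volume_ratio = det3(E_plus_B)                        # vol(V) / vol(U)
error = volume_ratio - (1.0 + theta)                 # second order in B
```

For entries of order 10^-3 the quantity `error` is of order 10^-6 or smaller, in agreement with the statement that the approximation holds to within the square of the eigenvalues.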

3. When an elastic body undergoes deformation, stresses arise.
Through the point O pass a plane oriented by the unit normal n and
consider small parts of the elastic medium adjoining this plane
near O. We replace the actual interaction of these parts by the
forces applied to them.

The force of action of one part of an elastic body on another
referred to unit cross-sectional area is called the stress at the
given point O for a given orientation n and is denoted by p
(Fig. 67). The dependence of p on the direction n at O can be
expressed to a high degree of accuracy by the following type of
formula:

$p = F(n)$

where $F = \|f_{ij}\|$ is a linear transformation that in general is
dependent on the choice of point O.

The tensor $f_{ij}$ is called the stress tensor.

4. If the elastic body is homogeneous and isotropic, that is, if
it has identical mechanical properties at all points and in all
directions, and the deformations are small, then the relation between
the strain tensor and the stress tensor is given by Hooke's law:

$F = \lambda\theta E + 2\mu B$

or, in coordinates,

$f_{ik} = \lambda\theta\delta_{ik} + 2\mu b_{ik}$

Here, $\lambda$ and $\mu$ are constants that describe the mechanical
properties of the elastic medium and $\theta$ is the coefficient of volume change
(see Subsection 2).
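
Hooke's law in coordinates is a direct recipe for computing the stress matrix from the strain matrix. A minimal sketch follows; the function name `hooke_stress` and the numerical values of the two constants are hypothetical and serve only as an illustration.

```python
def hooke_stress(strain, lam, mu):
    """Stress matrix f_ik = lam*theta*delta_ik + 2*mu*b_ik for a 3x3 strain matrix."""
    theta = sum(strain[i][i] for i in range(3))      # coefficient of volume change
    return [[lam * theta * (1.0 if i == k else 0.0) + 2.0 * mu * strain[i][k]
             for k in range(3)] for i in range(3)]

# Illustrative data: stretching along the first axis only, arbitrary units.
strain = [[1.0e-3, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
stress = hooke_stress(strain, lam=2.0, mu=1.0)
```

With these sample values the stress matrix is diagonal: the first diagonal entry receives both the volume term and the shear term, the other two only the volume term.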
§ 1. Alternation

1. Let a, b, c, ... be arbitrary vectors of a given linear space L.
We will consider products of vectors in L, regarding these products
as contravariant tensors, which is to say, as elements of
linear spaces $T_0^2$, $T_0^3$, etc. Here L itself is $T_0^1$ (see Chapter V).
Thus, for example, we have $a \in T_0^1$, $ab \in T_0^2$, $abc \in T_0^3$, and so on.

We use the term alternation of a product of vectors for an
operation which is denoted by square brackets and is defined by the
following equations:

$[a] = a,$

$[ab] = \frac{1}{2!}(ab - ba),$

$[abc] = \frac{1}{3!}(abc + bca + cab - bac - acb - cba)$

For any number of arbitrary vectors $a_1, a_2, \dots, a_m \in L$ we set

$[a_1 a_2 \dots a_m] = \frac{1}{m!}\Bigl(\sum\nolimits_1 a_{j_1} a_{j_2} \dots a_{j_m} - \sum\nolimits_2 a_{j_1} a_{j_2} \dots a_{j_m}\Bigr)$

where $\sum_1$ means the sum of all products obtained from the product
$a_1 a_2 \dots a_m$ for even permutations of the indices 1, 2, ..., m;
$\sum_2$ has a similar meaning for odd permutations. We can write this
differently as

$[a_1 a_2 \dots a_m] = \frac{1}{m!} \sum \delta^{j_1 j_2 \dots j_m}_{1\,2 \dots m}\, a_{j_1} a_{j_2} \dots a_{j_m}$ (1)

where the sum on the right is taken over all indices $j_1, j_2, \dots, j_m$,
each of which independently runs through all values from 1 to m;
$\delta^{j_1 j_2 \dots j_m}_{1\,2 \dots m} = +1$ if $j_1 j_2 \dots j_m$ is an even permutation of the m-tuple
1, 2, ..., m; $\delta^{j_1 j_2 \dots j_m}_{1\,2 \dots m} = -1$ if $j_1 j_2 \dots j_m$ is an odd permutation
of the m-tuple 1, 2, ..., m; $\delta^{j_1 j_2 \dots j_m}_{1\,2 \dots m} = 0$ if there are two identical
values among the values $j_1, j_2, \dots, j_m$.

2. The alternation of a product of vectors possesses the following
properties:

(1) Linearity in any factor; for example, in the first factor

$[(\alpha a_1' + \beta a_1'') a_2 \dots a_m] = \alpha [a_1' a_2 \dots a_m] + \beta [a_1'' a_2 \dots a_m]$

(2) Skew symmetry with respect to any pair of factors; for
instance, the first pair:

$[a_1 a_2 a_3 \dots a_m] = -[a_2 a_1 a_3 \dots a_m]$

These properties are immediately evident from the definition of
an alternation and so we give no proof here.
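
The definition of the alternation of a product of vectors and its skew symmetry can be checked by direct computation. In the sketch below a product of m vectors is stored as a dictionary of components indexed by m-tuples, and the alternation is formed as the signed sum over all permutations of the factors divided by m!. The function names and the sample vectors are hypothetical; exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..m-1, found by counting inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def tensor_product(vectors):
    """Components of the product a_1 a_2 ... a_m, keyed by index tuples."""
    n = len(vectors[0])
    comps = {}
    for idx in product(range(n), repeat=len(vectors)):
        value = Fraction(1)
        for vec, i in zip(vectors, idx):
            value *= vec[i]
        comps[idx] = value
    return comps

def alternation(vectors):
    """[a_1 ... a_m]: signed sum over permutations of the factors, divided by m!."""
    m, n = len(vectors), len(vectors[0])
    result = {idx: Fraction(0) for idx in product(range(n), repeat=m)}
    for perm in permutations(range(m)):
        permuted = tensor_product([vectors[p] for p in perm])
        for idx, value in permuted.items():
            result[idx] += sign(perm) * value
    return {idx: value / factorial(m) for idx, value in result.items()}

a = [Fraction(1), Fraction(2), Fraction(0)]
b = [Fraction(0), Fraction(1), Fraction(3)]
c = [Fraction(2), Fraction(0), Fraction(1)]

abc = alternation([a, b, c])
bac = alternation([b, a, c])   # first pair of factors interchanged
```

Comparing `abc` and `bac` component by component shows the change of sign under an interchange of the first two factors.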

3. If there are two identical vectors among $a_1, a_2, \dots, a_m$,
then $[a_1 a_2 \dots a_m] = 0$. Here the symbol 0 denotes the zero tensor
of the space $T_0^m$. The assertion is clear since under an interchange
of the identical vectors the alternation $[a_1 a_2 \dots a_m]$ does not change,
whereas by skew symmetry it must change sign.

4. If the vectors $a_1, a_2, \dots, a_m$ are linearly dependent, then
$[a_1 a_2 \dots a_m] = 0$. Indeed, assume for the sake of simplicity that
$a_1, \dots, a_k$ is a maximal independent subsystem of the system
of vectors $a_1, \dots, a_m$; we then have $a_{k+1} = \alpha_1 a_1 + \dots + \alpha_k a_k$. But
then from this, via the property of linearity and due to
Subsection 3, we obtain

$[a_1 \dots a_m] = \alpha_1 [a_1 \dots a_k a_1 a_{k+2} \dots a_m] + \dots + \alpha_k [a_1 \dots a_k a_k a_{k+2} \dots a_m] = \alpha_1 \cdot 0 + \dots + \alpha_k \cdot 0 = 0$

5. Furthermore we assume that the given space L is n-dimensional.

6. Then if m > n it follows that $[a_1 a_2 \dots a_m] = 0$ (this assertion
follows directly from Subsection 4).

7. Let x be an arbitrary tensor in $T_0^k$. By the definition of $T_0^k$,
the tensor x is a sum of products of certain vectors of L containing
k vectorial factors in each summand. We accordingly write

$x = a_1^{(1)} a_2^{(1)} \dots a_k^{(1)} + \dots + a_1^{(s)} a_2^{(s)} \dots a_k^{(s)}$ (2)

where $a_i^{(j)} \in L$. We use the term alternation of a tensor $x \in T_0^k$
for a tensor of the same order that is denoted by [x] ($[x] \in T_0^k$)
and is defined by the equation

$[x] = [a_1^{(1)} a_2^{(1)} \dots a_k^{(1)}] + \dots + [a_1^{(s)} a_2^{(s)} \dots a_k^{(s)}]$ (3)

8. The alternation of a tensor is not dependent on the mode of
notation in the form (2). In other words, if x' = x, then [x'] = [x].
Thus, let

$x' = b_1^{(1)} b_2^{(1)} \dots b_k^{(1)} + \dots + b_1^{(t)} b_2^{(t)} \dots b_k^{(t)}$ (4)

then by our definition

$[x'] = [b_1^{(1)} b_2^{(1)} \dots b_k^{(1)}] + \dots + [b_1^{(t)} b_2^{(t)} \dots b_k^{(t)}]$ (5)

Suppose x' = x. This means that the sum (2) reduces to the
sum (4) by means of admissible replacements (see Chapter V).
But by the linearity of an alternation, to each admissible
replacement in (2) there corresponds precisely the same one in (3).
Therefore, (3) reduces to (5) via the same admissible replacements that
reduce (2) to (4). Thus, [x'] = [x].


9. We have the identity

$[[a_1 a_2 \dots a_m]] = [a_1 a_2 \dots a_m]$ (6)

which states that two successive alternations of a product of
vectors return it to the original alternation.

Let us convince ourselves that (6) holds in two elementary
cases: m = 1 and m = 2. Here the identity is obvious:

$[[a_1]] = [a_1],$

$[[a_1 a_2]] = \frac{1}{2}([a_1 a_2] - [a_2 a_1]) = \frac{1}{2}\Bigl(\frac{1}{2}(a_1 a_2 - a_2 a_1) - \frac{1}{2}(a_2 a_1 - a_1 a_2)\Bigr) = \frac{1}{2}(a_1 a_2 - a_2 a_1) = [a_1 a_2]$

In the general case, we have, according to (1) and (3),

$[[a_1 a_2 \dots a_m]] = \frac{1}{m!} \sum \delta^{j_1 \dots j_m}_{1 \dots m} [a_{j_1} \dots a_{j_m}]$ (7)

But it is easy to see that in the sum on the right, all nonzero terms are
the same and each of them is equal to $[a_1 a_2 \dots a_m]$. Indeed, by
skew symmetry with respect to any pair of upper indices of the
quantity $\delta^{j_1 \dots j_m}_{1 \dots m}$ and because of skew symmetry with respect to
any pair of indices of the alternation $[a_{j_1} \dots a_{j_m}]$, in each term of
the sum (7) we can interchange any pair of indices in the m-tuple
$j_1 \dots j_m$ without altering the term. This means that without
altering any of the terms we can arrange all indices in natural order.
Since $\delta^{1 \dots m}_{1 \dots m} = 1$, it follows that every term reduces to $[a_1 a_2 \dots a_m]$.
Since there are m! nonzero terms in the sum (7), it follows that (7)
implies (6).

Remark. It is now clear why it is advantageous, in the definition
of an alternation, to take not merely the algebraic sum of the
products of the vectors but this sum divided by m!.

10. Due to (6) we have, for any tensor $x \in T_0^k$,

$[[x]] = [x]$

11. If k > n, then for any $x \in T_0^k$ we have [x] = 0 (see
Subsection 6).

12. Let $e_1, e_2, \dots, e_n$ be a basis in L. Then, as we know, all
products $e_{i_1} e_{i_2} \dots e_{i_k}$ constitute a basis in $T_0^k$ and for every
tensor $x \in T_0^k$ we have the expansion

$x = \sum x^{i_1 \dots i_k} e_{i_1} \dots e_{i_k}$

From this we obtain an expression for the alternation of tensor x
relative to the given (arbitrary) basis:

$[x] = \sum x^{i_1 \dots i_k} [e_{i_1} \dots e_{i_k}]$ (8)

13. From now on we will use the symbol $\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$ as having the
following meaning. We will assume that the indices $i_1, \dots, i_k$,
$j_1, \dots, j_k$ take on any values from 1 to n (where n is the
dimension of L). The number of all lower (or upper) indices, the
number k, may be arbitrary. If all numerical values $i_1, \dots, i_k$ are
distinct, then $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = +1$ when $j_1 \dots j_k$ is an even permutation
of the set of numbers $i_1, \dots, i_k$; $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = -1$ if $j_1 \dots j_k$ is an odd
permutation of the numbers $i_1, \dots, i_k$. In all other cases,
$\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = 0$. Thus, $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = 0$ if among the numerical values $i_1, \dots, i_k$
(or $j_1, \dots, j_k$) there are two identical ones or if among the
numerical values $j_1, \dots, j_k$ there is one that is absent from the set
$i_1, \dots, i_k$ (and conversely). In particular, $\delta^{j_1 \dots j_k}_{i_1 \dots i_k} = 0$ when k > n.

According to the definition just given, the set of numbers $\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$
possesses skew symmetry both in the upper and in the lower
indices. In other words, interchanging two upper indices or two
lower indices changes the sign of $\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$.

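14. It is easy to see that (1) implies the equation

$[e_{i_1} \dots e_{i_k}] = \frac{1}{k!} \sum \delta^{j_1 \dots j_k}_{i_1 \dots i_k} e_{j_1} \dots e_{j_k}$ (9)

where the indices $i_1, \dots, i_k$ are fixed and the sum on the right
is taken over all values of the indices $j_1, \dots, j_k$ from
1 to n. Hence, the right member of (9) is a resolution
of the tensor $[e_{i_1} \dots e_{i_k}]$ relative to a basis in the space $T_0^k$; the
numbers $\frac{1}{k!}\delta^{j_1 \dots j_k}_{i_1 \dots i_k}$ (for fixed $i_1, \dots, i_k$) are the components of
this tensor.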


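15. Rewrite (8) with the aid of (9):

$[x] = \frac{1}{k!} \sum x^{i_1 \dots i_k} \delta^{j_1 \dots j_k}_{i_1 \dots i_k} e_{j_1} \dots e_{j_k}$

We now introduce the notation

$x^{[j_1 \dots j_k]} = \frac{1}{k!} \sum \delta^{j_1 \dots j_k}_{i_1 \dots i_k} x^{i_1 \dots i_k}$ (10)

In other words, we put

$x^{[i]} = x^i,$

$x^{[ij]} = \frac{1}{2}(x^{ij} - x^{ji}),$

$x^{[ijk]} = \frac{1}{3!}(x^{ijk} + x^{jki} + x^{kij} - x^{jik} - x^{ikj} - x^{kji})$

and so on. By (10) we get

$[x] = \sum x^{[j_1 \dots j_k]} e_{j_1} \dots e_{j_k}$ (11)

The operation defined by (10) is called the alternation of the
components of tensor x, or the alternation of the indices $j_1, \dots, j_k$.
This was discussed in Chapter V.

Comparing (8) and (11), we see that in order to obtain an
alternation of tensor x we can proceed in one of two ways:

(1) either replace, in the expansion of tensor x, all products
of the basis vectors $e_{i_1} \dots e_{i_k}$ by their alternations $[e_{i_1} \dots e_{i_k}]$,
leaving the old coefficients $x^{i_1 \dots i_k}$;

(2) or replace the coefficients $x^{i_1 \dots i_k}$ by the corresponding
alternations $x^{[i_1 \dots i_k]}$, leaving unchanged the products of the basis
vectors $e_{i_1} \dots e_{i_k}$.

For the sake of clarity we demonstrate
the equivalence of these two operations in the case of k = 2.
If $x = \sum x^{ij} e_i e_j$, then

$[x] = \sum x^{ij} [e_i e_j] = \frac{1}{2} \sum x^{ij} (e_i e_j - e_j e_i) = \frac{1}{2} \sum (x^{ij} - x^{ji}) e_i e_j = \sum x^{[ij]} e_i e_j$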

16. Equation (11) is an expansion of the tensor [x] in terms of
a basis in $T_0^k$. Hence the alternations $x^{[i_1 \dots i_k]}$ of the components
of tensor x are the components of its alternation [x].
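
Since the components of [x] are the alternated components of x, the identity [[x]] = [x] of Subsection 10 can be verified directly on arrays of components. The sketch below implements the alternation of indices as a signed average over all permutations of the index positions; the function names and the sample tensors are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..k-1, found by counting inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def alternate_components(x, n, k):
    """Alternated components of an order-k tensor x over an n-dimensional space.

    x is a dict mapping every k-tuple of indices (0..n-1) to a component.
    """
    result = {}
    for idx in product(range(n), repeat=k):
        total = Fraction(0)
        for perm in permutations(range(k)):
            total += sign(perm) * x[tuple(idx[p] for p in perm)]
        result[idx] = total / factorial(k)
    return result

# Sample order-2 tensor with neither symmetric nor skew components.
x2 = {(i, j): Fraction(3 * i + j + 1) for i in range(3) for j in range(3)}
alt2 = alternate_components(x2, 3, 2)

# Sample order-3 tensor.
x3 = {idx: Fraction((idx[0] + 1) * (idx[1] + 2) ** 2 * (idx[2] + 3) ** 3)
      for idx in product(range(3), repeat=3)}
alt3 = alternate_components(x3, 3, 3)
```

Applying `alternate_components` a second time to `alt2` or `alt3` should return the same dictionary, which is the component form of [[x]] = [x].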

17. From (10) it follows that the alternations $x^{[i_1 \dots i_k]}$ of the
components of an arbitrary tensor $x \in T_0^k$ possess skew symmetry
in any pair of indices.

18. Definition. A tensor x ($x \in T_0^k$) is said to be skew if [x] = x.
From this definition and from Subsection 16 it follows that the
equation

$x^{[i_1 \dots i_k]} = x^{i_1 \dots i_k}$ (12)

holds true for the components of a skew tensor.

Thus, the components of a skew tensor have skew symmetry in
any pair of indices. Conversely, if the components $x^{i_1 \dots i_k}$ of any
tensor $x \in T_0^k$ have skew symmetry, then from (10) we get (12)
(via the very same arguments used to prove the identity (6) in
Subsection 9). Whence [x] = x.

Thus, the definition of a skew tensor x via the condition [x] = x
is equivalent to the definition via the property of the skew
symmetry of the components (see Chapter V, Section 8).

Remark. Since the equation [x] = x holds for any tensor x in $T_0^1$,
it follows that all first-order tensors are skew.

§ 2. Multivectors. Outer product 

1. Let us consider a set $\mathcal{G}$ whose elements are all contravariant
skew tensors of all possible orders (including the first order)
specified over an n-dimensional linear space L.

Definition. The outer (or alternate) product of the skew tensors
x, y, $x \in T_0^k$, $y \in T_0^l$, is a tensor of the space $T_0^{k+l}$. This is denoted
by the symbol $x \wedge y$ and is expressed by the formula

$x \wedge y = \frac{(k+l)!}{k!\,l!} [xy]$ (1)

The square brackets denote the alternation of an ordinary product
of tensors xy.

In the particular case of two arbitrary contravariant vectors x,
$y \in L$ we have

$x \wedge y = xy - yx$

2. An alternation always yields a skew tensor and therefore
outer multiplication does not take us outside the set $\mathcal{G}$. It is easy
to verify that a collection of skew tensors of a given order m forms
a subspace in $T_0^m$. We denote it by $G^m$. Hence the set $\mathcal{G}$ is closed
under addition of tensors of the same order and their multiplication by scalars.

Nonzero tensors in $\mathcal{G}$ are considered equal elements of this set
if and only if they belong to a single space $T_0^m$ and are equal
as elements of $T_0^m$. Besides, $\mathcal{G}$ includes zero elements of the spaces
$T_0^m$ for all natural m. Zero tensors of all orders are considered to
be equal elements of the set $\mathcal{G}$. We will denote them by the
symbol 0. Clearly,

$x \wedge 0 = 0$

for any $x \in \mathcal{G}$. By Subsection 6, Section 1, we have $x \wedge y = 0$
if $x \in T_0^k$, $y \in T_0^l$, k + l > n.

3. We will use the term Grassmann algebra * over the space L
to denote the set $\mathcal{G}$ with the aforementioned equality of elements
and the defined operations, in this set, of outer multiplication,
multiplication by a scalar, and also addition of tensors of the same
order.

Contravariant kth-order skew tensors viewed as elements of
Grassmann algebra are called contravariant k-vectors or contravariant
multivectors. The number k is called the order of the multivector.
The element 0 is called the zero multivector. To the order of the
zero multivector we can assign any natural value. All multivectors
of order k > n are equal to the zero multivector. Thus, all nonzero
multivectors belong to the spaces $T_0^1, T_0^2, \dots, T_0^n$.

* This definition of a Grassmann algebra is inadequate. The point is that
in sets called algebras the operation of addition is defined for all elements.
Therefore we should also have defined addition of tensors of different orders
in the set $\mathcal{G}$. This is done by constructing symbolic sums of elements of
different spaces $T_0^m$. Besides, one ordinarily includes in $\mathcal{G}$ tensors of order zero, that
is to say, scalars (invariants). As a result, $\mathcal{G}$ becomes a linear space of
dimension $2^n$ isomorphic to a direct sum of subspaces $G^m$:

$\mathcal{G} = G^0 \oplus G^1 \oplus \dots \oplus G^n$

where $G^0$ denotes the collection of tensors of order zero. We do not carry out
this construction in detail since we will not be dealing with the addition of
tensors of different orders. Also observe that the set $\mathcal{G}$ is often called a Grassmann
algebra over the space L*, which is conjugate to the given space L. This is
because the elements of $\mathcal{G}$ may be identified with multilinear forms whose arguments
are vectors of L*. A general definition of Grassmann algebra is given in [2].
4. Elementary properties of outer multiplication.

(1) $(\alpha x) \wedge y = x \wedge (\alpha y) = \alpha (x \wedge y)$ for any scalar $\alpha$ and any
$x, y \in \mathcal{G}$, since a numerical factor can be taken outside the sign
of ordinary multiplication of tensors and also outside the sign of
alternation.

(2) $(x + y) \wedge z = x \wedge z + y \wedge z$, since both ordinary
multiplication and alternation of a product of tensors are distributive
under addition.

(3) Outer multiplication is skew-commutative, namely,

$x \wedge y = (-1)^{kl}\, y \wedge x$ (2)

if $x \in T_0^k$, $y \in T_0^l$.

Proof. In coordinate (component) notation we have

$x = \sum x^{i_1 \dots i_k} e_{i_1} \dots e_{i_k}, \quad y = \sum y^{j_1 \dots j_l} e_{j_1} \dots e_{j_l}$

whence

$x \wedge y = \frac{(k+l)!}{k!\,l!} \sum x^{i_1 \dots i_k} y^{j_1 \dots j_l} [e_{i_1} \dots e_{i_k} e_{j_1} \dots e_{j_l}]$ (3)

Similarly

$y \wedge x = \frac{(k+l)!}{k!\,l!} \sum x^{i_1 \dots i_k} y^{j_1 \dots j_l} [e_{j_1} \dots e_{j_l} e_{i_1} \dots e_{i_k}]$ (4)

Here, $e_1, \dots, e_n$ is a basis in L, the indices $i_1, \dots, i_k, j_1, \dots, j_l$ run
from 1 to n independently, the summation is over all indices, and
the square brackets denote alternation, as above. To carry the
permutation of indices $(i_1, \dots, i_k, j_1, \dots, j_l)$ into the permutation
$(j_1, \dots, j_l, i_1, \dots, i_k)$ requires kl interchanges of adjacent indices.
Therefore and by virtue of the definition of an alternation, each
term in the sum (4) differs from the corresponding one in (3) by
the factor $(-1)^{kl}$, whence follows (2).

Corollary. If x is a multivector of odd order, then $x \wedge x = 0$.
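
Formula (1) of this section and the skew-commutativity rule (2) lend themselves to a direct numerical check. The sketch below stores a skew tensor of order k as a dictionary of components indexed by k-tuples and forms the outer product as the alternated ordinary product multiplied by (k+l)!/(k! l!). All function names and sample vectors are hypothetical; the orders k and l are passed explicitly to keep the code short.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..m-1, found by counting inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def alternate(x, n, k):
    """Alternation of the components of an order-k tensor x (dict of k-tuples)."""
    result = {}
    for idx in product(range(n), repeat=k):
        total = Fraction(0)
        for perm in permutations(range(k)):
            total += sign(perm) * x[tuple(idx[p] for p in perm)]
        result[idx] = total / factorial(k)
    return result

def outer(x, k, y, l, n):
    """Outer product: (k+l)!/(k! l!) times the alternation of the product xy."""
    xy = {idx: x[idx[:k]] * y[idx[k:]]
          for idx in product(range(n), repeat=k + l)}
    coeff = Fraction(factorial(k + l), factorial(k) * factorial(l))
    return {idx: coeff * v for idx, v in alternate(xy, n, k + l).items()}

def vector(values):
    """An ordinary vector written as an order-1 tensor."""
    return {(i,): Fraction(v) for i, v in enumerate(values)}

n = 3
a, b, c = vector([1, 2, 0]), vector([0, 1, 3]), vector([2, 0, 1])

ab = outer(a, 1, b, 1, n)      # k = l = 1: sign factor (-1)^1
ba = outer(b, 1, a, 1, n)
ab_c = outer(ab, 2, c, 1, n)   # k = 2, l = 1: sign factor (-1)^2
c_ab = outer(c, 1, ab, 2, n)
```

For two vectors the products `ab` and `ba` differ in sign, while for a bivector and a vector (kl = 2) the products `ab_c` and `c_ab` coincide, as rule (2) predicts.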

(4) An outer product is associative, that is, for any
multivectors x, y, z we have

$(x \wedge y) \wedge z = x \wedge (y \wedge z)$

To prove this identity we will need some preliminary results.

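5. Consider two arbitrary sets of basis vectors: $e_{i_1}, \dots, e_{i_k}$
and $e_{j_1}, \dots, e_{j_l}$. Take their alternations to get two multivectors:

$[e_{i_1} \dots e_{i_k}], \quad [e_{j_1} \dots e_{j_l}]$

Multiply them by the ordinary rule for multiplying tensors and
take the alternation of the resulting product:

$[[e_{i_1} \dots e_{i_k}][e_{j_1} \dots e_{j_l}]] = \frac{1}{k!\,l!} \sum \delta^{\alpha_1 \dots \alpha_k}_{i_1 \dots i_k} \delta^{\beta_1 \dots \beta_l}_{j_1 \dots j_l} [e_{\alpha_1} \dots e_{\alpha_k} e_{\beta_1} \dots e_{\beta_l}]$ (5)

To start with, suppose that there are no identical indices among
$i_1, \dots, i_k$ and among $j_1, \dots, j_l$. Then it will suffice, in the sum (5),
to consider only those terms where $\alpha_1 \dots \alpha_k$ is a permutation of
the set of indices $i_1, \dots, i_k$, and $\beta_1 \dots \beta_l$ is a permutation of
$j_1, \dots, j_l$ (the remaining terms are equal to zero). But it is obvious
that all such terms are identical and for any one of them we have

$\delta^{\alpha_1 \dots \alpha_k}_{i_1 \dots i_k} \delta^{\beta_1 \dots \beta_l}_{j_1 \dots j_l} [e_{\alpha_1} \dots e_{\alpha_k} e_{\beta_1} \dots e_{\beta_l}] = \delta^{i_1 \dots i_k}_{i_1 \dots i_k} \delta^{j_1 \dots j_l}_{j_1 \dots j_l} [e_{i_1} \dots e_{i_k} e_{j_1} \dots e_{j_l}] = [e_{i_1} \dots e_{i_k} e_{j_1} \dots e_{j_l}]$

It is also obvious that the total number of such terms is equal
to k! l!. Consequently

$[[e_{i_1} \dots e_{i_k}][e_{j_1} \dots e_{j_l}]] = [e_{i_1} \dots e_{i_k} e_{j_1} \dots e_{j_l}]$ (6)

It is now clear that (6) holds true at all times because if there
are identical indices among $i_1, \dots, i_k$ or $j_1, \dots, j_l$, then both
members are equal to the zero multivector.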


6. From (6) follows immediately the associative property for
an outer product of basis vectors. Namely, we have

$e_i \wedge e_j = 2!\,[e_i e_j]$

Furthermore

$(e_i \wedge e_j) \wedge e_k = \frac{3!}{2!\,1!} [2!\,[e_i e_j]\, e_k] = 3!\,[[e_i e_j]\, e_k] = 3!\,[e_i e_j e_k]$

Similarly

$e_i \wedge (e_j \wedge e_k) = 3!\,[e_i [e_j e_k]] = 3!\,[e_i e_j e_k]$

Thus, $(e_i \wedge e_j) \wedge e_k = e_i \wedge (e_j \wedge e_k)$ and $e_i \wedge e_j \wedge e_k = (e_i \wedge e_j) \wedge e_k = e_i \wedge (e_j \wedge e_k)$
is defined. From this and by induction (using (6)),
we get

$e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_m} = m!\,[e_{i_1} e_{i_2} \dots e_{i_m}]$ (7)

The outer product of many basis vectors (on the left) may be
determined, as is usual in such cases, via any successive combination
of factors. From (1), (6) and (7) it also follows that

$(e_{i_1} \wedge \dots \wedge e_{i_k}) \wedge (e_{j_1} \wedge e_{j_2} \wedge \dots \wedge e_{j_l}) = e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k} \wedge e_{j_1} \wedge e_{j_2} \wedge \dots \wedge e_{j_l}$ (8)

7. Let x be an arbitrary multivector in $T_0^k$:

$x = \sum x^{i_1 \dots i_k} e_{i_1} \dots e_{i_k}$ (9)

By the definition of a multivector we have x = [x] and,
consequently, we can write

$x = \sum x^{i_1 \dots i_k} [e_{i_1} \dots e_{i_k}]$

whence and due to (7)

$x = \frac{1}{k!} \sum x^{i_1 \dots i_k} e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$ (10)

The summation here is over all indices, each one running
independently from 1 to n. Terms of this sum where there is at least one
pair of identical indices are equal to zero. We now consider all
terms corresponding to all possible permutations of some single
set of distinct indices. There are k! such terms for a given set of
indices and they are all equal. And so in place of (10) we can
write

$x = {\sum}^{*}\, x^{i_1 \dots i_k} e_{i_1} \wedge \dots \wedge e_{i_k}$ (11)

where the starred sigma is the summation sign over all sets of
indices $i_1, \dots, i_k$ provided $i_1 < i_2 < \dots < i_k$.

8. As we know, all multivectors of a given order k form in $T_0^k$
a subspace denoted by $G^k$. From (11) follows an important result:
the outer products $e_{i_1} \wedge \dots \wedge e_{i_k}$ of the basis vectors of space L
which correspond to all possible sets of indices $i_1 < i_2 < \dots < i_k$
constitute a basis in $G^k$.

Indeed, by (11) every multivector $x \in G^k$ can be expanded in
terms of outer products $e_{i_1} \wedge \dots \wedge e_{i_k}$ ($i_1 < \dots < i_k$). On the
other hand, it is easy to see that these outer products are linearly
independent in $G^k$. Suppose there are numbers $x^{i_1 \dots i_k}$, where
$i_1 < i_2 < \dots < i_k$, for which (11) yields x = 0. We determine
$x^{i_1 \dots i_k}$ for any arrangements of the indices $i_1, \dots, i_k$ by the
condition of skew symmetry; in other words we put
$x^{i_2 i_1 i_3 \dots i_k} = -x^{i_1 i_2 i_3 \dots i_k}$ and so on. Then in place of (11) we can write the
equivalent expansion (9), where the summation is over all
indices and the coefficients $x^{i_1 \dots i_k}$ are the components of the
tensor x. We have x = 0; hence $x^{i_1 \dots i_k} = 0$ as components of the
zero tensor. This proves the linear independence of the outer
products

$e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$ ($i_1 < i_2 < \dots < i_k$)

9. Corollary. The dimension of the subspace $G^k$ is equal to $C_n^k$.
True enough, for the number of all outer products $e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$,
$i_1 < i_2 < \dots < i_k$, is equal to the number of combinations
of n elements taken k at a time, which is to say, $C_n^k$.
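
The count in the Corollary can be made concrete by listing the increasing index sets themselves. The sketch below is a small illustration with a hypothetical helper name; including k = 0 (scalars, as in the footnote to Subsection 3) makes the dimensions add up to 2^n.

```python
from itertools import combinations
from math import comb

def basis_index_sets(n, k):
    """All index sets i1 < i2 < ... < ik drawn from 1..n, one per basis k-vector."""
    return list(combinations(range(1, n + 1), k))

n = 4
# Dimension of each subspace G^k for k = 0, 1, ..., n.
dims = [len(basis_index_sets(n, k)) for k in range(n + 1)]
total = sum(dims)
```

For n = 4 the list `dims` consists of the binomial coefficients of order 4, and `basis_index_sets(3, 2)` lists the three index pairs of the basis bivectors of three-dimensional space.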

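10. Let us now return to the problem of the associative property
of an outer product of any multivectors. The proof of this property
follows from identity (8). To simplify the proof, we introduce the
notation

$e_{i_1 \dots i_k} = e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$

Then (8) becomes

$e_{i_1 \dots i_k} \wedge e_{j_1 \dots j_l} = e_{i_1 \dots i_k j_1 \dots j_l}$ (12)

Consider arbitrary multivectors $x \in T_0^k$, $y \in T_0^l$. We can write
them as

$x = {\sum}^{*}\, x^{i_1 \dots i_k} e_{i_1 \dots i_k}, \quad y = {\sum}^{*}\, y^{j_1 \dots j_l} e_{j_1 \dots j_l}$

From this and also from (12) we have

$x \wedge y = {\sum}^{*}\, x^{i_1 \dots i_k} y^{j_1 \dots j_l} e_{i_1 \dots i_k j_1 \dots j_l}$ (13)

Here, the starred sigma is the sign of summation over all sets
of indices $i_1, \dots, i_k$ and $j_1, \dots, j_l$ provided $i_1 < i_2 < \dots < i_k$ and
$j_1 < j_2 < \dots < j_l$. However, in the general set $i_1, \dots, i_k, j_1, \dots, j_l$
these indices may not be arranged in increasing order.

Let $z \in T_0^m$ be another multivector:

$z = {\sum}^{*}\, z^{s_1 \dots s_m} e_{s_1 \dots s_m}$ (14)

Again using (12), from (13) and (14) we find

$(x \wedge y) \wedge z = {\sum}^{*}\, x^{i_1 \dots i_k} y^{j_1 \dots j_l} z^{s_1 \dots s_m} e_{i_1 \dots i_k j_1 \dots j_l} \wedge e_{s_1 \dots s_m} = {\sum}^{*}\, x^{i_1 \dots i_k} y^{j_1 \dots j_l} z^{s_1 \dots s_m} e_{i_1 \dots i_k j_1 \dots j_l s_1 \dots s_m}$ (15)

On the other hand,

$x \wedge (y \wedge z) = {\sum}^{*}\, x^{i_1 \dots i_k} y^{j_1 \dots j_l} z^{s_1 \dots s_m} e_{i_1 \dots i_k} \wedge e_{j_1 \dots j_l s_1 \dots s_m} = {\sum}^{*}\, x^{i_1 \dots i_k} y^{j_1 \dots j_l} z^{s_1 \dots s_m} e_{i_1 \dots i_k j_1 \dots j_l s_1 \dots s_m}$ (16)

The starred sigma in (15) and (16) is used in the same sense as
in (13). Comparing (15) and (16), we obtain $(x \wedge y) \wedge z = x \wedge (y \wedge z)$,
and the associative property is proved.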

11. In the usual manner we now define the outer product of any
number of any contravariant multivectors: $x \wedge y \wedge z \wedge \dots \wedge w$.
For example,

$x \wedge y \wedge z \wedge t = ((x \wedge y) \wedge z) \wedge t = (x \wedge y) \wedge (z \wedge t) = x \wedge (y \wedge (z \wedge t))$

§ 3. Bivectors 

1. A bivector of a space L is a multivector of order two. As
before, we deal in contravariant multivectors. Therefore, when
speaking of bivectors, we have in view multivectors in $T_0^2$.

2. A bivector p is called a simple bivector if it is equal to the
outer product of two vectors:

$p = a_1 \wedge a_2$ (1)

where $a_1, a_2 \in L$. If $a_1, a_2$ are linearly dependent, then p = 0. If
$a_1, a_2$ are independent, then $p \neq 0$. True, because if $a_1, a_2$ are
independent, they can be completed to form a basis $a_1, a_2, a_3, \dots, a_n$.
But then the outer products $a_i \wedge a_j$, i < j, constitute a basis in the
subspace of bivectors over L (see Subsection 8, Section 2).
Consequently, none of these outer products, including $a_1 \wedge a_2$, can be
a zero bivector.

Thus, the simple bivector (1) is zero if and only if the vectors
$a_1, a_2$ are linearly dependent.

3. Suppose $p \neq 0$ and, hence, $a_1, a_2$ are independent. Then $a_1, a_2$
define in L a two-dimensional subspace $L_2$ which is their linear
hull: $L_2 = L(a_1, a_2)$. Let $b_1, b_2$ be any two vectors in $L_2$. Due to
the independence of $a_1$ and $a_2$, we have

$b_1 = \alpha_{11} a_1 + \alpha_{12} a_2,$
$b_2 = \alpha_{21} a_1 + \alpha_{22} a_2$ (2)

where $\alpha_{ij}$ are numerical coefficients. Consider the bivector

$q = b_1 \wedge b_2$ (3)

From (2) and (3) we get, by the rules of outer multiplication,

$q = b_1 \wedge b_2 = (\alpha_{11} a_1 + \alpha_{12} a_2) \wedge (\alpha_{21} a_1 + \alpha_{22} a_2) = \alpha_{11}\alpha_{22} (a_1 \wedge a_2) + \alpha_{12}\alpha_{21} (a_2 \wedge a_1) = (\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21})(a_1 \wedge a_2)$

Thus

$q = Dp, \quad D = \begin{vmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{vmatrix}$ (4)

That is, the bivector q is proportional to the bivector p and the
constant of proportionality $D = \det \|\alpha_{ij}\|$.
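
Relation (4) is easy to confirm on numbers. The sketch below represents a simple bivector by its components with i < j relative to the basis of outer products of basis vectors (the same components that appear later in this section as oriented areas); the vectors and coefficients are arbitrary sample data, and the function names are hypothetical.

```python
def wedge(b1, b2):
    """Components (i < j) of the simple bivector formed from vectors b1 and b2."""
    n = len(b1)
    return {(i, j): b1[i] * b2[j] - b1[j] * b2[i]
            for i in range(n) for j in range(i + 1, n)}

def combo(alpha, beta, a1, a2):
    """The linear combination alpha*a1 + beta*a2."""
    return [alpha * u + beta * v for u, v in zip(a1, a2)]

# Sample independent vectors of a four-dimensional space.
a1 = [1, 0, 2, -1]
a2 = [0, 3, 1, 4]

# Sample coefficient matrix ||alpha_ij|| defining b1, b2 as in equations (2).
alpha = [[2, 5], [-1, 3]]
b1 = combo(alpha[0][0], alpha[0][1], a1, a2)
b2 = combo(alpha[1][0], alpha[1][1], a1, a2)

D = alpha[0][0] * alpha[1][1] - alpha[0][1] * alpha[1][0]
p = wedge(a1, a2)
q = wedge(b1, b2)
```

Every component of `q` should equal `D` times the corresponding component of `p`; with the sample coefficients above, `D` is 11.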

Conversely, suppose that $q = b_1 \wedge b_2 = \lambda p$, where $\lambda$ is a number
($\lambda \neq 0$). Then $b_1, b_2 \in L_2$, that is, we have equations (2), and
$\lambda = D$ from (4).

Let us prove this assertion.

We consider three arbitrary vectors $c_1, c_2, c_3$. If they are
dependent, then

$c_1 \wedge c_2 \wedge c_3 = 0$ ($\alpha$)

(see Subsection 4 of Section 1 and (7) of Section 2). If $c_1, c_2, c_3$
are independent, then $c_1 \wedge c_2 \wedge c_3 \neq 0$. Indeed, the independent
vectors $c_1, c_2, c_3$ may be completed to the basis $c_1, c_2, c_3, \dots, c_n$.
But then all the outer products of the type $c_{i_1} \wedge c_{i_2} \wedge c_{i_3}$, $i_1 < i_2 < i_3$,
will constitute a basis in $G^3$. Hence they are all different from
zero. Thus, equation ($\alpha$) is necessary and sufficient for the
dependence of vectors $c_1, c_2, c_3$. Now let $b_1 \wedge b_2 = \lambda (a_1 \wedge a_2)$, where
$\lambda \neq 0$, $a_1 \wedge a_2 \neq 0$. Form the outer product of both members
by $b_1$. On the left we get $b_1 \wedge b_1 \wedge b_2 = 0$, and so $b_1 \wedge a_1 \wedge a_2 = 0$.
From this, by what has already been said about the arbitrary
vectors $c_1, c_2, c_3$, we conclude that $b_1, a_1, a_2$ are dependent. Since $a_1, a_2$
are independent, this means $b_1 \in L_2 = L(a_1, a_2)$. Similarly $b_2 \in L_2 = L(a_1, a_2)$. It is then clear
that $\lambda = D$.

4. To summarize, if $a_1 \wedge a_2 \neq 0$, then

$b_1 \wedge b_2 = \lambda (a_1 \wedge a_2), \quad \lambda \neq 0$ (5)

if and only if $b_1, b_2$ are independent and $b_1, b_2 \in L_2 = L(a_1, a_2)$.
Also, $\lambda = \det \|\alpha_{ij}\|$, where $\|\alpha_{ij}\|$ is a matrix composed of the
components of the vectors $b_1, b_2$ relative to the basis $a_1, a_2$.

5. In particular

$b_1 \wedge b_2 = a_1 \wedge a_2$

if and only if $b_1, b_2 \in L_2 = L(a_1, a_2)$ and

$\det \|\alpha_{ij}\| = 1$

6. The subspace $L_2 = L(a_1, a_2)$ is called the subspace of the
bivector $a_1 \wedge a_2$. We say that the bivector $a_1 \wedge a_2$ lies in the
subspace $L_2$. We also say that $a_1 \wedge a_2$ is the direction bivector of this
subspace (just like an ordinary vector lying on a straight line is
said to be the direction vector of that line).

7. Suppose a linear space L is equipped with a Euclidean metric.
For the sake of clarity, imagine the nonzero bivector $b_1 \wedge b_2$
in the form of an oriented parallelogram constructed on an ordered
pair of vectors $b_1, b_2$ (Fig. 68). The area of this parallelogram will
be called the area of the bivector $b_1 \wedge b_2$. A bivector with area
unity is termed a unit bivector.

Now let $a_1 \wedge a_2$ be a unit bivector. Then $\det \|\alpha_{ij}\| = \pm\sigma$,
where $\sigma$ is the area of the bivector $b_1 \wedge b_2$, and relation (5)
becomes

$b_1 \wedge b_2 = \pm\sigma (a_1 \wedge a_2)$ (6)

The signs $\pm$ correspond to cases where the bivector $b_1 \wedge b_2$ is
oriented in $L_2$ like the bivector $a_1 \wedge a_2$ or has the opposite
orientation.

If we assume that the subspace $L_2$ itself is oriented by an
ordered pair of vectors $a_1, a_2$, then in place of (6) we can write

$b_1 \wedge b_2 = S (a_1 \wedge a_2)$ (7)

where $S = \det \|\alpha_{ij}\|$ is the oriented area of the bivector $b_1 \wedge b_2$.

8. Thus, simple bivectors lying in $L_2$ are depicted in the form
of oriented parallelograms of the subspace $L_2$. By (6) or (7),
parallelograms of equal area and identical orientation in $L_2$ depict
the same bivector (Fig. 69).

9. Assuming L to be a Euclidean space, take in L an orthonormal
basis $e_1, \dots, e_n$. Consider arbitrary vectors $b_1, b_2 \in L$. We
have

$b_1 = x_1^1 e_1 + x_1^2 e_2 + \dots + x_1^n e_n,$
$b_2 = x_2^1 e_1 + x_2^2 e_2 + \dots + x_2^n e_n$

Choose a pair of distinct basis vectors $e_i, e_j$ and assume i < j.
They define a two-dimensional coordinate plane, which we denote
by $E_{ij}$ (to be more exact, we should say that $E_{ij}$ is a
two-dimensional subspace, namely $L(e_i, e_j)$). We consider that the plane is
oriented by the bivector $e_i \wedge e_j$, i < j. We use the term projection
of the bivector $b_1 \wedge b_2$ on $E_{ij}$ for the bivector

$(x_1^i e_i + x_1^j e_j) \wedge (x_2^i e_i + x_2^j e_j) = S^{ij} e_i \wedge e_j$

where $S^{ij} = x_1^i x_2^j - x_1^j x_2^i$ is the oriented area of a parallelogram
constructed in $E_{ij}$ on the vectors $x_1^i e_i + x_1^j e_j$ and $x_2^i e_i + x_2^j e_j$, that
is, on the projections of $b_1$ and $b_2$. Also, by the rules of outer
multiplication we find

$b_1 \wedge b_2 = {\sum}^{*}\, S^{ij} e_i \wedge e_j$ (8)

where, as before, the starred sigma signifies summation with the
proviso that i < j.

From (8) we conclude that in an orthonormal basis the
components of the simple bivector $b_1 \wedge b_2$ are oriented areas $S^{ij}$ of its
projections on the two-dimensional coordinate planes $E_{ij}$, i < j
(see Fig. 70, where n = 3).


10. In the three-dimensional case, the basis consists of three
vectors $e_1, e_2, e_3$ and the sum (8) also has three terms:

$b_1 \wedge b_2 = S^{12} (e_1 \wedge e_2) + S^{13} (e_1 \wedge e_3) + S^{23} (e_2 \wedge e_3)$

Therefore, with each bivector $b_1 \wedge b_2$ of three-dimensional
Euclidean space L we can associate a vector c of the same space L,
putting

$c = S^{23} e_1 - S^{13} e_2 + S^{12} e_3$

The vector c is determined by the bivector $b_1 \wedge b_2$ invariantly
with respect to changes to other orthonormal bases with the same
orientation as that of the basis $e_1, e_2, e_3$. It is easy to see that the
vector c is the vector product of $b_1$ by $b_2$ (Fig. 71):

$c = [b_1 \times b_2]$
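
The identification of c with the vector product can be checked by computing both sides from coordinates in an orthonormal basis. In the sketch below `oriented_areas` returns the three quantities $S^{12}, S^{13}, S^{23}$ and `associated_vector` assembles them into c; both names, and the sample vectors, are hypothetical.

```python
def oriented_areas(b1, b2):
    """Oriented areas S^{ij}, i < j, of the projections on the coordinate planes."""
    return {(i + 1, j + 1): b1[i] * b2[j] - b1[j] * b2[i]
            for i in range(3) for j in range(i + 1, 3)}

def associated_vector(b1, b2):
    """The vector c with coordinates (S^{23}, -S^{13}, S^{12})."""
    s = oriented_areas(b1, b2)
    return [s[(2, 3)], -s[(1, 3)], s[(1, 2)]]

def cross(u, v):
    """Vector product of u and v written out in coordinates."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

b1 = [1, 2, 3]
b2 = [4, 5, 6]
c = associated_vector(b1, b2)
```

The list `c` should coincide with `cross(b1, b2)`, and interchanging `b1` and `b2` should reverse its sign, in agreement with the skew symmetry of the outer product.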

11. Let us take a look at arbitrary bivectors (not necessarily
simple) in n-dimensional linear space L. For the time being we do
not assume a Euclidean metric in L. Let $e_1, \dots, e_n$ be a basis
in L. Then $e_1 \wedge e_2, \dots, e_{n-1} \wedge e_n$ constitute a basis in the space
of bivectors over L. For any bivector u we have the expansion:

$u = {\sum}^{*}\, u^{ij} e_i \wedge e_j = u^{12} e_1 \wedge e_2 + u^{13} e_1 \wedge e_3 + \dots + u^{1n} e_1 \wedge e_n + u^{23} e_2 \wedge e_3 + \dots + u^{2n} e_2 \wedge e_n + \dots + u^{n-1,n} e_{n-1} \wedge e_n$ (9)

Thus, every bivector decomposes into a sum of simple bivectors.

12. Due to the relations $e_i \wedge e_j = e_i e_j - e_j e_i$, we can replace (9)
with an expansion of u relative to a basis in $T_0^2$:

$u = \sum u^{ij} e_i e_j$
$\quad = 0 \cdot e_1 e_1 + u^{12} e_1 e_2 + \dots + u^{1n} e_1 e_n$
$\quad + u^{21} e_2 e_1 + 0 \cdot e_2 e_2 + \dots + u^{2n} e_2 e_n$
$\quad + \dots$
$\quad + u^{n1} e_n e_1 + \dots + u^{n,n-1} e_n e_{n-1} + 0 \cdot e_n e_n$ (10)

Here, for i > j we have $u^{ij} = -u^{ji}$. The diagonal contains zero
components (with coefficients $u^{ii} = 0$).

13. The rank of the matrix $\|u^{ij}\|$ composed of the coefficients of
expansion (10), that is, of the components of the bivector u
relative to the basis $e_i e_j$, is called the rank of the bivector u. We
denote the rank by r ($r = \operatorname{rank} \|u^{ij}\|$).

14. The rank of the bivector u is also the rank of the bilinear
form $u(\xi, \eta) = \sum u^{ij} \xi_i \eta_j$ with covariant arguments $\xi = (\xi_1, \dots, \xi_n)$,
$\eta = (\eta_1, \dots, \eta_n)$, $\xi, \eta \in L^*$. This form is a complete contraction
and thus an invariant, and so its rank is invariant, whence
follows the invariance of the rank of the bivector, that is, the rank
is independent of any choice of basis $e_1, \dots, e_n \in L$.

15. Let us consider a linear operator U, which associates with
an arbitrary vector $\xi$ in L* its right-hand contraction with the
bivector u:

$x = U\xi = (u, \xi) \in L$

Let us write down the transformation U in terms of components:

$x^i = \sum u^{ij} \xi_j$ (11)

where $x^i$ are the components of vector x relative to any basis
$e_1, \dots, e_n$ of space L, and $\xi_j$ are the components of $\xi$ relative to the
reciprocal basis $e^1, \dots, e^n$ of space L*. The matrix of
transformation U coincides with the matrix $\|u^{ij}\|$ of the bivector u. Therefore
the rank of U (defined as the rank of its matrix) coincides with
the rank of the bivector u (rank U = r).

Let $L_r = U(L^*)$ be the image of the entire space L*. By
Section 3, Chapter VII, $L_r$ is a subspace of dimension r in the space L.
The subspace $L_r$ will be called the rank subspace of the bivector u.

Suppose that the basis vectors $e_1, \dots, e_r$ are chosen in $L_r$. Then
they form a basis in $L_r$ and every vector $x = U\xi$ can be
decomposed in terms of the vectors $e_1, \dots, e_r$. In other words, in this
case we have

$x^{r+1} = \dots = x^n = 0$ (12)

Equations (12) hold true for arbitrary $\xi \in L^*$, and so from (11)
and (12) we have $u^{ij} = 0$ if i > r. From this, $u^{ij} = 0$ if j > r
since $u^{ij} = -u^{ji}$. And so if the basis $e_1, \dots, e_n$ is such that
$e_1, \dots, e_r \in L_r$, then the matrix of the components of the bivector u
in terms of the basis $e_i e_j \in T_0^2$ (that is, the matrix of coefficients of
the expansion (10)) is of the form

$\begin{pmatrix} A_r & 0 \\ 0 & 0 \end{pmatrix}$ ($\beta$)

where $A_r$ denotes the r × r square submatrix, all other elements of
the matrix ($\beta$) being zeros. Clearly $D_r = \det A_r \neq 0$, otherwise
the rank of the bivector u would be less than r.

16. Let $L_0^*$ be the zero subspace of the bilinear form $u(\xi, \eta)$
($L_0^*$ is of dimension $n - r$). From Subsection 15 it follows that
if $x \in L_r$, $\xi \in L_0^*$, then

$$(x, \xi) = 0 \tag{$\gamma$}$$

If ($\gamma$) holds for any $\xi \in L_0^*$, then $x \in L_r$; if ($\gamma$) holds for any
$x \in L_r$, then $\xi \in L_0^*$. Thus, if we regard a contraction as the
analogue of a scalar product, then $L_r$ and $L_0^*$ are similar to a subspace
and its orthogonal complement. (Of course one has to bear in
mind that $L_r$ and $L_0^*$ lie in distinct spaces.)

To see the truth of the foregoing, observe that $L_0^*$ is the null
space of the transformation $U$ and that if the vectors $e_1, \dots, e_r$
are chosen in $L_r$, then the vectors $e^{r+1}, \dots, e^n$ of the reciprocal
basis will lie in $L_0^*$ (see (11) with account taken of ($\beta$)) and will
constitute a basis there.

17. Theorem 1. The rank of every bivector is an even number. 

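Proof. If $r$ were odd, we would have $D_r = 0$, since every
skew-symmetric determinant of odd order is zero. (This is evident if we
multiply by $(-1)$ every row of the skew-symmetric determinant
and then take the transpose: the first step multiplies the determinant
by $(-1)^r = -1$, the two steps together restore the original
determinant, and so $D_r = -D_r$.) This contradicts $D_r \neq 0$
(Subsection 15), and so $r$ is even.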
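
Theorem 1 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `rank` and `random_skew` are illustrative assumptions, and exact rational arithmetic is used so that no rounding can disturb the rank.

```python
from fractions import Fraction
import random

def rank(matrix):
    """Rank of a matrix by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def random_skew(n, rng):
    """Random integer matrix with u[i][j] = -u[j][i] and zero diagonal."""
    u = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            u[i][j] = rng.randint(-3, 3)
            u[j][i] = -u[i][j]
    return u

rng = random.Random(0)  # fixed seed so the experiment is repeatable
ranks = [rank(random_skew(n, rng)) for n in range(2, 8) for _ in range(20)]
print(sorted(set(ranks)))  # every value printed should be even
```

The matrices play the role of the component matrix $\| u^{ij} \|$ of a bivector; odd orders are included on purpose, since there the rank must drop below $n$.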

18. Theorem 2. Every bivector in three-dimensional space is 
simple. 

Proof. If $L$ is three-dimensional, then by the preceding theorem,
for every bivector $u$ over $L$ two cases are possible: $r = 0$ and
$r = 2$. In the former case, $u$ is a zero bivector and hence we can
write $u = a \wedge a$, where $a$ is any vector in $L$. In the latter case,
the rank subspace $L_r$ of $u$ is two-dimensional. Therefore, if in $L$
we take the basis $e_1, e_2, e_3$ provided $e_1, e_2 \in L_r$, then the expansion
(10) becomes

$$u = u^{12} e_1 \wedge e_2$$

which proves the theorem.

19. Theorem 3. In a space of arbitrary dimension, every nonzero
bivector $u$ may be represented as a sum of simple bivectors, whose
number is equal to half the rank $r$ of $u$:

$$u = p_1 \wedge q_1 + \dots + p_k \wedge q_k, \qquad 2k = r \tag{13}$$

The vectors $p_1, q_1, \dots, p_k, q_k$ are linearly independent.

Proof. First of all note that if we have an expansion like (13),
then $p_1, q_1, \dots, p_k, q_k$ are definitely independent. Indeed, suppose
these vectors are dependent. Then their linear hull $\tilde L$ has
dimension $s < r$ and $p_1, q_1, \dots, p_k, q_k$ can be expanded in terms of the
basis $e_1, \dots, e_s \in \tilde L$. Substituting their expansions into (13), we
get for $u$ an expansion like (10) relative to the basis in $T_0^2$
over $\tilde L$, that is, for $i, j = 1, 2, \dots, s < r$. But in that case the rank
of $u$ will prove to be less than $r$, which is a contradiction. Also
note that by similar reasoning the number of simple bivectors in a
sum of type (13) cannot be less than $r/2$.

We prove the possibility of expansion (13) by induction. It is
clear that for every bivector $u$ of rank 2 there exists an expansion
of type (13), namely, $u = p \wedge q$, where $p, q$ are independent
vectors lying in a two-dimensional rank subspace. Suppose that
the possibility of expansion (13) has been established for all
bivectors of rank $2, 4, \dots, r - 2$; then we will show that such an
expansion is also possible for a bivector of rank $r$. This will
complete the proof.

Let the basis $e_1, \dots, e_n$ in $L$ be chosen with the proviso that
$e_1, \dots, e_r \in L_r$. We then have

$$
\begin{aligned}
u &= u^{12} e_1 \wedge e_2 + u^{13} e_1 \wedge e_3 + \dots + u^{1r} e_1 \wedge e_r \\
  &+ u^{23} e_2 \wedge e_3 + \dots + u^{2r} e_2 \wedge e_r \\
  &+ \dots + u^{r-1,r} e_{r-1} \wedge e_r
\end{aligned}
$$

Put $e_1 = p_1$, $u^{12} e_2 + u^{13} e_3 + \dots + u^{1r} e_r = q_1$. From this and from
the foregoing expansion we find that the rank of the bivector
$u - p_1 \wedge q_1$ does not exceed the number of vectors in the set
$e_2, e_3, \dots, e_r$, that is, it does not exceed $r - 1$. But the rank of every
bivector is an even number. Consequently, the rank of the bivector
$u - p_1 \wedge q_1$ does not exceed $r - 2$. For this reason and by the
induction hypothesis there exist vectors $p_2, q_2, \dots, p_k, q_k$, where
$2k \leq r$ (that is, the number of pairs $p_i, q_i$, $i = 2, \dots, k$, does not
exceed half the number $r - 2$), such that
$u - p_1 \wedge q_1 = p_2 \wedge q_2 + \dots + p_k \wedge q_k$.
From this we obtain the expansion (13). Here $2k = r$ since in
reality $2k < r$ is not possible due to the remark made at the
beginning of the proof. The proof is complete.
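
The expansion (13) can also be produced mechanically. The sketch below is an illustration only and uses a different route from the induction in the proof: a skew-symmetric elimination that works directly on the component matrix $\| u^{ij} \|$ in any basis, without first adapting the basis to $L_r$. The names `wedge_matrix` and `decompose_bivector` are assumptions introduced here. At each step a nonzero component $c = u^{ij}$ is picked, $p$ is taken as column $i$ divided by $c$ and $q$ as column $j$; subtracting $p \wedge q$ then zeroes rows and columns $i$ and $j$ and lowers the rank by exactly 2.

```python
from fractions import Fraction

def wedge_matrix(p, q):
    """Component matrix of p ^ q: entries p^i q^j - q^i p^j."""
    n = len(p)
    return [[p[i] * q[j] - q[i] * p[j] for j in range(n)] for i in range(n)]

def decompose_bivector(u):
    """Split a skew-symmetric matrix into a list of pairs (p, q)
    whose wedge products sum to the matrix."""
    n = len(u)
    u = [[Fraction(x) for x in row] for row in u]
    pairs = []
    while True:
        pivot = next(((i, j) for i in range(n) for j in range(i + 1, n)
                      if u[i][j] != 0), None)
        if pivot is None:
            return pairs
        i, j = pivot
        c = u[i][j]
        p = [u[s][i] / c for s in range(n)]  # column i scaled by 1/c
        q = [u[s][j] for s in range(n)]      # column j
        w = wedge_matrix(p, q)
        # Subtracting p ^ q zeroes rows and columns i and j.
        u = [[u[a][b] - w[a][b] for b in range(n)] for a in range(n)]
        pairs.append((p, q))

# A rank-4 example in a four-dimensional space.
U = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]
pairs = decompose_bivector(U)
print(len(pairs))  # number of simple bivectors in the sum
```

The number of pairs returned is half the rank of the matrix, in agreement with $2k = r$.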

20. If a Euclidean metric is specified in an $n$-dimensional linear
space $L$, we can assume that $L^*$ coincides with $L$. Here, by a
contraction $(x, y)$ of two elements $x, y$ of $L$ we mean their scalar
product. From this and on the basis of the reasoning of Subsection 16
we conclude that for a bivector in Euclidean space $L$, both the rank
subspace $L_r$ and the zero subspace $L_0$ are defined in $L$ itself (we
now understandably write $L_0$ without an asterisk). The subspaces
$L_0$ and $L_r$ are orthogonal complements of each other.

21. Let $y = Ax$ be a linear transformation specified in the
$n$-dimensional Euclidean space $L$. In component notation we have
$y_i = \sum A_{is} x^s$, where $x^s$ are contravariant components of the
vector $x$ and $y_i$ are covariant components of the vector $y$; here, $A_{is}$ are
covariant components of a tensor of the given linear transformation.
We denote this tensor by $A$ and the transformation by
$y = (A, x)$. The parentheses denote a right contraction of tensor $A$
with vector $x$. We call the linear transformation skew if
$A_{is} = -A_{si}$ (see Chapter IX).

The definitions and theorems given above for contravariant
bivectors carry over directly to covariant bivectors (skew
second-order covariant tensors). In the case of a skew transformation,
tensor $A$ is a covariant bivector, the rank of which is equal to the
rank of that transformation. From this and by Subsection 19 we
have the following proposition.

Let $y = Ax$ be a skew linear transformation in Euclidean
space $L$; if its rank is $r$, then in $L$ there will be independent
vectors $p_1, q_1, \dots, p_k, q_k$, where $k = r/2$, such that the given
transformation can be represented as

$$y = (p_1 \wedge q_1, x) + \dots + (p_k \wedge q_k, x)$$

Here, $(p_i \wedge q_i, x)$ is the right contraction of bivector $p_i \wedge q_i$ with
vector $x$:

$$(p_i \wedge q_i, x) = p_i (q_i, x) - q_i (p_i, x)$$

where $(p_i, x)$ and $(q_i, x)$ are scalar products.

22. In particular, if $L$ is three-dimensional Euclidean space and
$y = Ax$ is an arbitrary skew transformation of nonzero rank specified
in $L$, then there are independent vectors $p, q$ such that
$y = (p \wedge q, x)$. Setting $a = -[p \times q]$, we get $y = [a \times x]$. Indeed,
$[a \times x] = [x \times [p \times q]] = p(q, x) - q(p, x) = (p \wedge q, x)$. Thus, in
three-dimensional Euclidean space, every skew transformation can
be represented in the form of a vector product (including a
transformation of rank zero when $a = 0$). This result has already been
established in Chapter IX.
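
The identity $[a \times x] = (p \wedge q, x)$ with $a = -[p \times q]$ is easy to confirm on numbers. The following sketch assumes an orthonormal basis in three-dimensional Euclidean space; the function names and the sample vectors are illustrative choices, not part of the text.

```python
def dot(a, b):
    """Scalar product in an orthonormal basis."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Ordinary vector product in three-dimensional space."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def contract(p, q, x):
    """Right contraction (p ^ q, x) = p (q, x) - q (p, x)."""
    return [pi * dot(q, x) - qi * dot(p, x) for pi, qi in zip(p, q)]

p, q, x = [1, 2, 3], [0, -1, 4], [5, 7, -2]
a = [-c for c in cross(p, q)]   # a = -[p x q]
lhs = cross(a, x)               # [a x x]
rhs = contract(p, q, x)         # (p ^ q, x)
print(lhs == rhs)
```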

23. We conclude this section with a proposition known as
Cartan's lemma.
Given in a linear space $L$ two sets of vectors $p_1, \dots, p_k$ and
$q_1, \dots, q_k$, the set $p_1, \dots, p_k$ being linearly independent. Let

$$p_1 \wedge q_1 + \dots + p_k \wedge q_k = 0 \tag{14}$$

Then the vectors $q_1, \dots, q_k$ are expressed linearly in terms of
$p_1, \dots, p_k$ by the relations

$$q_i = \sum_{s=1}^{k} a_{is} p_s \tag{15}$$

with the $k \times k$ symmetric matrix $\| a_{is} \|$, that is, $a_{is} = a_{si}$.
Conversely, if (15) hold and the matrix $\| a_{is} \|$ is symmetric, then we
also have (14).

Proof. (1) We first demonstrate the converse. Given (15) and
$a_{is} = a_{si}$. Then

$$\sum_{i=1}^{k} p_i \wedge q_i = \sum_{i,s=1}^{k} a_{is}\, p_i \wedge p_s = {}^{*}\!\sum (a_{is} - a_{si})\, p_i \wedge p_s = 0$$

Thus (14) holds true.

(2) Now let (14) be given. We complete the set of vectors
$p_1, \dots, p_k$ to the basis $p_1, \dots, p_k, p_{k+1}, \dots, p_n$ in $L$. Then we write
the expansion

$$q_i = \sum_{s=1}^{k} a_{is} p_s + a_{i,k+1} p_{k+1} + \dots + a_{in} p_n \tag{16}$$

From this and (14) we have

$$\sum_{i=1}^{k} p_i \wedge q_i = {}^{*}\!\sum_{i,s=1}^{k} (a_{is} - a_{si})\, p_i \wedge p_s + \sum_{i=1}^{k} \left( a_{i,k+1}\, p_i \wedge p_{k+1} + \dots + a_{in}\, p_i \wedge p_n \right) = 0 \tag{17}$$

But the outer products $p_i \wedge p_s$, $i < s$, form a basis in the
subspace of bivectors over $L$. For this reason and by (17), $a_{is} = a_{si}$
if $i, s = 1, 2, \dots, k$, and, besides, $a_{is} = 0$ for $s > k$. Thus,
relations (16) reduce to relations of type (15) and matrix $\| a_{is} \|$ is
symmetric. The proof is complete.
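
Both directions of Cartan's lemma can be observed on a small numerical case. The sketch below is an illustration under stated assumptions: vectors are given by components in some basis of a four-dimensional space, a bivector is stored as its skew-symmetric component matrix, and the names `wedge_matrix` and `bivector_sum` are introduced here for the purpose.

```python
def wedge_matrix(p, q):
    """Component matrix of p ^ q: entries p^i q^j - q^i p^j."""
    n = len(p)
    return [[p[i] * q[j] - q[i] * p[j] for j in range(n)] for i in range(n)]

def bivector_sum(ps, a):
    """Matrix of sum_i p_i ^ q_i, where q_i = sum_s a[i][s] p_s as in (15)."""
    k, n = len(ps), len(ps[0])
    total = [[0] * n for _ in range(n)]
    for i in range(k):
        q_i = [sum(a[i][s] * ps[s][c] for s in range(k)) for c in range(n)]
        w = wedge_matrix(ps[i], q_i)
        total = [[total[r][c] + w[r][c] for c in range(n)] for r in range(n)]
    return total

# Three linearly independent vectors p_1, p_2, p_3.
ps = [[1, 0, 2, 0], [0, 1, 1, 0], [1, 1, 0, 3]]

symmetric = [[1, 2, 0], [2, -1, 3], [0, 3, 4]]
asymmetric = [[1, 2, 0], [5, -1, 3], [0, 3, 4]]   # a_12 != a_21

zero_case = bivector_sum(ps, symmetric)
nonzero_case = bivector_sum(ps, asymmetric)
print(all(v == 0 for row in zero_case for v in row))
print(any(v != 0 for row in nonzero_case for v in row))
```

With the symmetric matrix the sum (14) vanishes; breaking the symmetry in a single pair of entries leaves the term $(a_{12} - a_{21})\, p_1 \wedge p_2$, which is nonzero because the $p_i$ are independent.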

§ 4. Simple multivectors 

1. A simple contravariant multivector in space $L$ is an outer
product of several vectors taken in $L$:

$$p = a_1 \wedge a_2 \wedge \dots \wedge a_k$$

The number $k$ is called the order of the multivector $p$. We also
say that $p$ is a $k$-vector in $L$.

2. We now present a number of propositions concerning simple
multivectors of any order. They are a natural generalization of the
propositions dealing with simple bivectors given in Subsections 2
to 9 of Section 3. The proofs of these propositions, which were
presented in detail for the special case of $k = 2$, carry over in trivial
fashion to the general case.

(1) $p = 0$ if and only if the vectors $a_1, \dots, a_k$ are linearly
dependent.

(2) Let $a_1, \dots, a_k$ be independent; hence $p \neq 0$. Then $a_1, \dots, a_k$
determine a $k$-dimensional linear subspace $L_k$ in $L$, namely its
linear hull: $L_k = L(a_1, \dots, a_k)$. We say that $L_k$ is a linear
subspace of the multivector $p$ or that the multivector $p$ lies in $L_k$. We
also say that $p$ is a direction multivector of the subspace $L_k$.

Let us take in $L_k$ arbitrary vectors $b_1, \dots, b_k$. We have

$$
\begin{aligned}
b_1 &= \alpha_{11} a_1 + \dots + \alpha_{1k} a_k \\
    &\;\;\vdots \\
b_k &= \alpha_{k1} a_1 + \dots + \alpha_{kk} a_k
\end{aligned}
\tag{1}
$$

where $\alpha_{ij}$ are numerical coefficients. Taking advantage of the
distributive property of outer multiplication and its skew
symmetry, we can readily prove that

$$q = b_1 \wedge b_2 \wedge \dots \wedge b_k = \det \| \alpha_{ij} \| \, (a_1 \wedge a_2 \wedge \dots \wedge a_k) \tag{2}$$

Indeed, in the termwise multiplication of the right members of
(1) we only have to take into account those terms
$\alpha_{1 j_1} \cdots \alpha_{k j_k}\, a_{j_1} \wedge \dots \wedge a_{j_k}$
where the indices $j_1, \dots, j_k$ are all distinct
(all other terms are zero). But for a written down (arbitrary)
term we have

$$a_{j_1} \wedge \dots \wedge a_{j_k} = \delta_{j_1 \dots j_k}\, a_1 \wedge \dots \wedge a_k$$

where $\delta_{j_1 \dots j_k} = \pm 1$ is the sign of the permutation $(j_1, \dots, j_k)$.
Therefore

$$
\begin{aligned}
b_1 \wedge \dots \wedge b_k &= \sum \delta_{j_1 \dots j_k}\, \alpha_{1 j_1} \cdots \alpha_{k j_k}\; a_1 \wedge \dots \wedge a_k \\
&= \det \| \alpha_{ij} \| \, (a_1 \wedge \dots \wedge a_k)
\end{aligned}
$$

Here we have used formula (1) from Section 3, Chapter II
(strictly speaking, this formula yields the determinant of the
transposed matrix $\| \alpha_{ij} \|$, since the summation here is over the second
indices of the elements $\alpha_{ij}$ and not over the first indices, as in
Section 3, Chapter II).

Let us once again stress that the factor $\det \| \alpha_{ij} \|$ is obtained
due to the linearity of the product in each argument and to the
skew symmetry of the product in each pair of arguments. In the
sequel we will carry out similar computations without going into
so much detail.
Thus, $q = Dp$, that is, the multivector $q$ is proportional to the
multivector $p$ and the proportionality factor $D = \det \| \alpha_{ij} \|$.

Conversely, if $q = b_1 \wedge b_2 \wedge \dots \wedge b_k = \lambda p$, where $\lambda$ is a
nonzero scalar, then $b_1, \dots, b_k$ are independent and lie in $L_k$; that is
to say, equations (1) hold, and $\lambda = D = \det \| \alpha_{ij} \|$.

(3) In particular

$$b_1 \wedge b_2 \wedge \dots \wedge b_k = a_1 \wedge a_2 \wedge \dots \wedge a_k$$

when and only when $b_1, b_2, \dots, b_k \in L_k = L(a_1, \dots, a_k)$ and
$\det \| \alpha_{ij} \| = 1$.
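
Formula (2) can be tested on coordinates. The sketch below is an illustration under an assumption borrowed from Subsection 5 of this section: the components of a simple multivector are taken to be the $k$-th order minors of the matrix of its factors (formula (6) there). The helper names and the sample data are hypothetical.

```python
from itertools import combinations, permutations

def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant by the permutation expansion (fine for small orders)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for r in range(n):
            term *= m[r][p[r]]
        total += term
    return total

def components(vectors):
    """Minors of order k of the k x n matrix of factors, keyed by columns."""
    k, n = len(vectors), len(vectors[0])
    return {cols: det([[v[c] for c in cols] for v in vectors])
            for cols in combinations(range(n), k)}

a = [[1, 0, 2, 1], [0, 1, 1, -1]]   # a_1, a_2 spanning L_2
alpha = [[2, 1], [3, -1]]           # coefficients of expansion (1)
b = [[sum(alpha[i][j] * a[j][c] for j in range(2)) for c in range(4)]
     for i in range(2)]             # b_i = sum_j alpha_ij a_j

D = det(alpha)
p_comp, q_comp = components(a), components(b)
print(all(q_comp[c] == D * p_comp[c] for c in p_comp))
```

Every component of $q = b_1 \wedge b_2$ equals $D$ times the corresponding component of $p = a_1 \wedge a_2$, which is the coordinate form of $q = Dp$.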


3. Suppose a linear space $L$ is equipped with a Euclidean metric.
Let $b_1, \dots, b_k \in L_k = L(a_1, \dots, a_k)$. We imagine a nonzero
multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ in the form of an oriented
$k$-dimensional parallelepiped in $L_k$ constructed on an ordered set of
vectors $b_1, b_2, \dots, b_k$. The volume ($k$-dimensional) of this
parallelepiped will be called the volume of the multivector
$b_1 \wedge b_2 \wedge \dots \wedge b_k$. We will use the term unit multivector for one
whose volume is equal to unity. The volume of an arbitrary simple
multivector is computed in accordance with Subsection 6, Section 13,
Chapter VIII.

Suppose the original multivector $a_1 \wedge a_2 \wedge \dots \wedge a_k$ is a unit
multivector and that the subspace $L_k = L(a_1, \dots, a_k)$ is oriented
by an ordered $k$-tuple of vectors $a_1, \dots, a_k$. Then $\det \| \alpha_{ij} \| = V$,
where $V$ is the oriented $k$-dimensional volume of the multivector
$b_1 \wedge b_2 \wedge \dots \wedge b_k$. Equation (2) can now be written as

$$b_1 \wedge b_2 \wedge \dots \wedge b_k = V \cdot a_1 \wedge a_2 \wedge \dots \wedge a_k \tag{3}$$

4. Thus, simple multivectors of order $k$ lying in $L_k$ are depicted
as oriented parallelepipeds of the subspace $L_k$. By (3),
parallelepipeds of equal volume and the same orientation in $L_k$ represent
one and the same multivector (for $k = 3$ see Fig. 72).

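5. In $L$ take an orthonormal basis $e_1, \dots, e_n$. Let us consider
the set of arbitrary vectors $b_1, \dots, b_k \in L$. We have

$$b_j = \sum_{i=1}^{n} x_j^i e_i, \qquad j = 1, \dots, k \tag{4}$$

We choose some basis vectors $e_{i_1}, \dots, e_{i_k}$ provided $i_1 < i_2 < \dots < i_k$.
They determine a $k$-dimensional coordinate plane, which we denote
by $E_{i_1 \dots i_k}$ (to be more precise, $E_{i_1 \dots i_k}$ is the linear hull
of the vectors $e_{i_1}, \dots, e_{i_k}$). We will assume that the plane $E_{i_1 \dots i_k}$
is oriented by the multivector $e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$. Consider any
vector $b_j$ of the $k$-tuple $b_1, \dots, b_k$. Denote by $\bar b_j$ the projection
of $b_j$ on the plane $E_{i_1 \dots i_k}$:

$$\bar b_j = x_j^{i_1} e_{i_1} + x_j^{i_2} e_{i_2} + \dots + x_j^{i_k} e_{i_k}$$

We say that the multivector $\bar b_1 \wedge \bar b_2 \wedge \dots \wedge \bar b_k$ is the projection
of the multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ on the $k$-dimensional
plane $E_{i_1 \dots i_k}$. By (3) we have

$$\bar b_1 \wedge \bar b_2 \wedge \dots \wedge \bar b_k = V^{i_1 i_2 \dots i_k}\, e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$$

Here, $V^{i_1 i_2 \dots i_k}$ is the oriented $k$-dimensional volume of a
parallelepiped constructed in $E_{i_1 \dots i_k}$ on the vectors $\bar b_1, \dots, \bar b_k$. At the
same time, for the multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ itself, we find,
via the rules of outer multiplication,

$$b_1 \wedge b_2 \wedge \dots \wedge b_k = {}^{*}\!\sum V^{i_1 i_2 \dots i_k}\, e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k} \tag{5}$$

where, as usual, the starred sigma denotes summation with the
proviso that $i_1 < i_2 < \dots < i_k$.

From (5) we conclude that, relative to an orthonormal basis,
the components of the simple multivector $b_1 \wedge b_2 \wedge \dots \wedge b_k$ are

$$
V^{i_1 i_2 \dots i_k} =
\begin{vmatrix}
x_1^{i_1} & x_1^{i_2} & \dots & x_1^{i_k} \\
x_2^{i_1} & x_2^{i_2} & \dots & x_2^{i_k} \\
\vdots & \vdots & & \vdots \\
x_k^{i_1} & x_k^{i_2} & \dots & x_k^{i_k}
\end{vmatrix}
\tag{6}
$$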

6. In Subsection 5 we discussed projections and volumes. However,
the algebraic manipulations that resulted in formulas (5)
and (6) do not depend on the metric. For this reason, if the
vectors (4) are specified in a linear space $L$ relative to some basis
$e_1, \dots, e_n$, then formula (5) with coefficients (6) holds true for the
multivector $b_1 \wedge \dots \wedge b_k$.

§ 5. Vector product


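1. Using simple multivectors, it is possible in a natural way to
extend the notion of a vector product to the case of any number $k$
of vector factors in Euclidean space of any dimension $n$ ($n > k$).

First recall that in $E_3$ the components of a vector product
$z = [x \times y]$ in an arbitrary basis are expressed as

$$z^i = \sum g^{i\gamma} \varepsilon_{\alpha\beta\gamma}\, x^\alpha y^\beta = \sum \varepsilon_{\alpha\beta\cdot}^{\;\cdot\,\cdot\, i}\, x^\alpha y^\beta$$

Here, $g^{ij}$ is the contravariant metric tensor, $\varepsilon_{\alpha\beta\gamma}$ is the covariant
discriminant tensor, and $\varepsilon_{\alpha\beta\cdot}^{\;\cdot\,\cdot\, i}$ are the mixed components of the
discriminant tensor which are obtained by raising the last index
(see Sections 9 and 13, Chapter VIII).

Now let $x_1, \dots, x_k$ be $k < n$ vectors given in $E_n$ and let $\{x_i^\alpha\}$
be the components of the vector $x_i$ in an arbitrary basis $e_1, \dots, e_n$
of $E_n$. We will construct a tensor $z$ via the following formula:

$$z^{i_1 \dots i_{n-k}} = \sum g^{i_1 \beta_1} \cdots g^{i_{n-k} \beta_{n-k}}\, \varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}\, x_1^{\alpha_1} \cdots x_k^{\alpha_k} \tag{1}$$

Here, as usual, $\varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}$ is the discriminant tensor and
$g^{i\beta}$ is the metric tensor of $E_n$. The discriminant tensor is axial and has
skew symmetry with respect to all indices. Therefore $z$ is an axial
multivector, that is, an axial skew-symmetric tensor (of order
$n - k$). According to the general rules (see Section 9,
Chapter VIII), we can write it in terms of covariant components. We
obtain a simpler formula:

$$z_{\beta_1 \dots \beta_{n-k}} = \sum \varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}\, x_1^{\alpha_1} \cdots x_k^{\alpha_k} \tag{2}$$

The multivector $z$ of order $n - k$ defined by (2) is called the
vector product of the vectors $x_1, \dots, x_k$ taken in that order and
we will denote it by $[x_1 \times \dots \times x_k]$.

In other words,

$$[x_1 \times \dots \times x_k]_{\beta_1 \dots \beta_{n-k}} = \sum \varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}\, x_1^{\alpha_1} \cdots x_k^{\alpha_k} \tag{3}$$

We will now show that the multivector $z$ is a multidimensional
analogue of the ordinary vector product.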


2. The properties of a vector product (the proof is given in
Subsection 3 below) are as follows.

(1) A vector product is linear in each factor. For example, for
the first factor we have

$$[(\alpha x_1' + \beta x_1'') \times x_2 \times \dots \times x_k] = \alpha [x_1' \times x_2 \times \dots \times x_k] + \beta [x_1'' \times x_2 \times \dots \times x_k]$$

(2) A vector product is skew-symmetric in any pair of factors.
For instance, for the first pair of factors we have

$$[x_1 \times x_2 \times \dots \times x_k] = -[x_2 \times x_1 \times \dots \times x_k]$$

(3) $[x_1 \times \dots \times x_k] = 0$ if and only if the factors are linearly
dependent.

(4) A vector product is a simple multivector. To be more precise,
there exist vectors $y_1, \dots, y_{n-k} \in E_n$ such that

$$[x_1 \times x_2 \times \dots \times x_k] = y_1 \wedge y_2 \wedge \dots \wedge y_{n-k}$$

(5) If the vectors $x_1, \dots, x_k$ are linearly independent, then the
vectors $y_1, \dots, y_{n-k}$ are also linearly independent. Thus the
subspaces $L_k = L(x_1, \dots, x_k)$ and $L_{n-k} = L(y_1, \dots, y_{n-k})$ are fully
defined. These subspaces $L_k$ and $L_{n-k}$ are orthogonal complements
of one another.

(6) If the vectors $x_1, \dots, x_k$ are linearly independent, then the
orientations of $x_1, \dots, x_k$ in $L_k$ and $y_1, \dots, y_{n-k}$ in $L_{n-k}$ are such
that the combined set of vectors $x_1, \dots, x_k, y_1, \dots, y_{n-k}$ has the
same orientation as the basis $e_1, \dots, e_n$ (that is, as that basis in
$E_n$ relative to which the components are specified in the original
equations (1)).

(7) The $(n - k)$-dimensional volume of the multivector
$z = y_1 \wedge y_2 \wedge \dots \wedge y_{n-k}$ is numerically equal to the $k$-dimensional
volume of a parallelepiped constructed on the vectors $x_1, \dots, x_k$.

3. Proof. Properties (1) and (2) follow immediately from
equation (2). Now let us prove property (3). From properties (1) and
(2) it follows that $z = 0$ if $x_1, \dots, x_k$ are linearly dependent.

Now suppose $x_1, \dots, x_k$ are linearly independent. In the linear
hull $L_k = L(x_1, \dots, x_k)$ we choose an orthonormal basis $a_1, \dots, a_k$
with orientation the same as that of the set of vectors $x_1, \dots, x_k$,
and complete it to the orthonormal basis $a_1, \dots, a_k, \dots, a_n$ in the
entire space $E_n$ which has the same orientation as the original
basis $e_1, \dots, e_n$. We have an expansion of the form

$$x_i = \alpha_{i1} a_1 + \dots + \alpha_{ik} a_k$$

and $\det \| \alpha_{ij} \| = V$, where $V > 0$ is the $k$-dimensional volume of
a parallelepiped constructed in $L_k$ on the vectors $x_1, \dots, x_k$. Due
to the linearity and skew symmetry of a vector product, we get

$$[x_1 \times x_2 \times \dots \times x_k] = \det \| \alpha_{ij} \| \, [a_1 \times a_2 \times \dots \times a_k] = V [a_1 \times a_2 \times \dots \times a_k]$$

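whence and also from (1)

$$z^{i_1 \dots i_{n-k}} = V \sum g^{i_1 \beta_1} \cdots g^{i_{n-k} \beta_{n-k}}\, \varepsilon_{\alpha_1 \dots \alpha_k \beta_1 \dots \beta_{n-k}}\, a_1^{\alpha_1} \cdots a_k^{\alpha_k} \tag{4}$$

The tensor equation (4) holds true in any basis with orientation
the same as that of $e_1, \dots, e_n$. In particular, in the orthonormal
basis $a_1, \dots, a_n$ we have $g^{i\beta} = \delta^{i\beta}$, $a_j^\alpha = \delta_j^\alpha$, so that in this basis

$$z^{i_1 \dots i_{n-k}} = V \varepsilon_{1 2 \dots k\, i_1 \dots i_{n-k}}$$

whence it follows that the multivector $z$ has, for $i_1 < i_2 < \dots < i_{n-k}$,
a unique nonzero component, namely,

$$z^{k+1, \dots, n} = V \tag{5}$$

From (5) it is evident that in this case (linearly independent
factors) $[x_1 \times \dots \times x_k] \neq 0$. This proves property (3).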

We now prove properties (4) to (7) of Subsection 2. Note that
there cannot be more than one simple multivector with properties
(5) to (7) (these properties specify a subspace and also the volume
and orientation of a multivector). We construct a simple
multivector $u$ equal to the vector product $z$ and possessing properties (5)
to (7). Set

$$u = (V a_{k+1}) \wedge a_{k+2} \wedge \dots \wedge a_n$$

From (5), Section 4, we find that

$$u^{k+1, \dots, n} = V \tag{6}$$

and that the remaining components $u^{i_1 \dots i_{n-k}}$ are zero, provided
$i_1 < i_2 < \dots < i_{n-k}$. Comparing (5) and (6) we get

$$u = z$$

Therefore, setting

$$y_1 = V a_{k+1}, \quad y_2 = a_{k+2}, \quad \dots, \quad y_{n-k} = a_n$$

we see that all the properties (4) to (7) of Subsection 2 hold true.
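
Properties (5) to (7) can be observed numerically in the simplest case $k = n - 1$, where the vector product is a single vector. The sketch below is an illustration under stated assumptions: the basis is orthonormal (so that the discriminant tensor reduces to the sign of a permutation and upper and lower indices coincide), indices are 0-based, and `vector_product` is a hypothetical helper that evaluates the sum in formula (2) by grouping its terms into minors.

```python
from itertools import combinations, permutations

def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant by the permutation expansion (fine for small orders)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for r in range(n):
            term *= m[r][p[r]]
        total += term
    return total

def vector_product(xs):
    """Components z^J (J increasing, 0-based) of [x_1 x ... x x_k]
    in an orthonormal basis."""
    k, n = len(xs), len(xs[0])
    z = {}
    for J in combinations(range(n), n - k):
        I = tuple(i for i in range(n) if i not in J)
        minor = det([[x[i] for i in I] for x in xs])
        z[J] = perm_sign(I + J) * minor
    return z

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

xs = [[1, 0, 2, 1], [0, 1, 1, -1], [2, 1, 0, 3]]   # k = 3 vectors in E_4
z = vector_product(xs)
y = [z[(j,)] for j in range(4)]                     # n - k = 1: z is a vector

print([dot(y, x) for x in xs])                      # property (5)
print(det(xs + [y]) > 0)                            # property (6)
gram = det([[dot(a, b) for b in xs] for a in xs])
print(dot(y, y) == gram)                            # property (7), squared
```

For $n = 3$, $k = 2$ the same function returns the components of the ordinary vector product.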

4. The notion of a vector product helps in an arbitrary (in
particular, an oblique-angled) basis to solve the following geometric
problems.

(a) Given in a Euclidean (vector) space $E_n$ a subspace $L_k$. Find
its orthogonal complement $L_{n-k}$.

(b) Given in a Euclidean (point) space $E_n$ a point $A$ and
a $k$-dimensional plane $P_k$ that does not pass through $A$. Pass
through $A$ a plane $P_{n-k}$ of dimension $n - k$ orthogonal to $P_k$.

(c) Under the hypothesis of (b), drop from point $A$ a
perpendicular on $P_k$ (that is, find a straight line orthogonal to $P_k$
passing through $A$ and intersecting $P_k$).

(d) Under the hypothesis of (b), find the shortest distance from
point $A$ to plane $P_k$.
(e) Given in space $E_n$ the skew planes $P_k$ and $P_l$. Construct
their common perpendicular.

(f) Under the hypothesis of (e), find the shortest distance
between $P_k$ and $P_l$.

Without going into details, consider the outlines of the
solutions of the foregoing problems.

To solve (a) it suffices to take a basis $x_1, \dots, x_k$ in $L_k$ and
construct the subspace $L_{n-k}$ of the multivector $[x_1 \times \dots \times x_k]$. The
required computations are given below in Subsections 5 and 6.

Problem (b) clearly reduces to (a).

Let $P_{n-k}$ be a plane obtained in the solution of problem (b). It
is easy to establish (say by Theorem 5, Section 7, Chapter III)
that $P_{n-k}$ intersects $P_k$ and to prove that their point of
intersection is unique. Denote that point by $B$. Then the straight line $AB$
will be the desired perpendicular in problem (c), and the length of
the segment $AB$ will be the desired distance in problem (d).

The solution of problem (e) constitutes a multidimensional
generalization of the construction of a common perpendicular to two
skew lines in $E_3$.

Namely, let us consider the direction subspaces $L_k$ and $L_l$ of
planes $P_k$ and $P_l$ respectively. Construct the sum $L' = L_k + L_l$ and
its orthogonal complement $\tilde L$. Finding the latter reduces to solving
(a). Let the intersection $L_k \cap L_l$ have dimension $m$. Then the
dimension of $\tilde L$ is $p = n - (k + l - m)$.

Pass a plane $\hat P$ through an arbitrary point $M$ of plane $P_k$ in
the direction of subspace $\hat L = L_k \oplus \tilde L$ (Fig. 73). $\hat P$ has dimension
$k + p = n + m - l$. By virtue of the construction of $\hat L$, the
intersection $\hat L \cap L_l$ coincides with the intersection $L_k \cap L_l$ and
therefore has dimension $m$. Using Theorem 5, Section 7, Chapter III,
we find that the plane $\hat P$ intersects the plane $P_l$. Let $C$ be a point
of their intersection. It can be shown that for $m \geq 1$ the point $C$
is not unique, in contrast to the familiar three-dimensional case.

Through point $C$ draw plane $\tilde P$ in the direction of the
subspace $\tilde L$. Note that $\tilde P$ depends on the choice of $C$ in the plane
$P_l \cap \hat P$ (see Fig. 74, where $n = 4$, $P_k = L_k = L(e_1, e_2)$, $P_l$ passes
in the direction of $L(e_2, e_3)$, $\tilde P$ is one-dimensional). Regarding $P_k$
and $\tilde P$ as planes in a $(k + p)$-dimensional affine space $\hat P$ and once
again applying Theorem 5 of Section 7, Chapter III, we find that
$P_k$ and $\tilde P$ intersect in a point $D$ (Figs. 73, 74). The straight line $CD$
intersects $P_k$ and $P_l$. It is contained in $\tilde P$ so that its direction
vector belongs to $\tilde L$ and therefore is orthogonal to all vectors in $L_k$
and $L_l$. Thus, the straight line $CD$ is the desired common
perpendicular (generally not unique) to $P_k$ and $P_l$. We leave it to the
reader to prove that the length of $CD$ is the shortest distance
between $P_k$ and $P_l$, so that the solution of problem (f) is also
obtained.

Note that the finding of subspaces, planes and points mentioned
above reduces to the solution of certain systems of linear
equations.

5. We return to problem (a). Suppose that $L_k$ is specified as the
linear hull of independent vectors $x_1, \dots, x_k$ expanded in terms of
a basis $e_1, \dots, e_n$:

$$x_i = \sum x_i^j e_j$$

By Subsections 1 and 2 we have

$$z = [x_1 \times \dots \times x_k] = y_1 \wedge \dots \wedge y_{n-k} = {}^{*}\!\sum z^{i_1 \dots i_{n-k}}\, e_{i_1} \wedge \dots \wedge e_{i_{n-k}}$$

where $z^{i_1 \dots i_{n-k}}$ are determined from (1) and the vectors
$y_1, \dots, y_{n-k}$ constitute a basis in the desired subspace $L_{n-k}$. Write
down the expansion of the vectors $y_1, \dots, y_{n-k}$ in terms of the
basis $e_1, \dots, e_n$:

$$y_i = \sum y_i^j e_j$$

with the unknown coefficients $y_i^j$ and construct the matrix

$$
Y = \left(\begin{array}{ccc|ccc}
y_1^1 & \dots & y_1^{n-k} & y_1^{n-k+1} & \dots & y_1^n \\
\vdots & & \vdots & \vdots & & \vdots \\
y_{n-k}^1 & \dots & y_{n-k}^{n-k} & y_{n-k}^{n-k+1} & \dots & y_{n-k}^n
\end{array}\right)
\tag{7}
$$

As in Subsection 5, Section 4, we can prove that each of the
components $z^{i_1 \dots i_{n-k}}$ of the multivector $z$ is equal to a determinant of
order $n - k$ formed by the columns of matrix $Y$ with number labels
$i_1, \dots, i_{n-k}$. Therefore, although we do not know the numerical
values of the elements of $Y$, we do know the numerical values of
all its minors of order $(n - k)$. For the sake of simplicity, suppose
the left (indicated) minor of matrix (7) is a basis minor, that is,
that $z^{1 2 \dots n-k} \neq 0$.

The vector $u = \sum u^i e_i$ belongs to $L_{n-k}$ if and only if

$$
\operatorname{rank}
\begin{pmatrix}
u^1 & \dots & u^{n-k} & \dots & u^n \\
y_1^1 & \dots & y_1^{n-k} & \dots & y_1^n \\
\vdots & & \vdots & & \vdots \\
y_{n-k}^1 & \dots & y_{n-k}^{n-k} & \dots & y_{n-k}^n
\end{pmatrix}
= n - k
\tag{8}
$$

We now construct a minor of order $(n - k + 1)$ of matrix (8), the
minor being formed by columns labelled $1, \dots, n - k, l$, where
$n - k < l \leq n$. Such a minor is definitely equal to zero. Expand
it by the first row to get a linear equation for the components of
vector $u$; putting $l = n - k + 1, \dots, n$, we obtain the homogeneous
system of equations

$$Au = 0 \tag{9}$$

which has a $k \times n$ matrix of the form

$$A = \left(\; B \;\middle|\; (-1)^{n-k} z^{1 2 \dots n-k}\, E_k \;\right)$$

where $E_k$ is the unit matrix of order $k$ and $B$ is a $k \times (n - k)$ block
whose elements are, up to sign, the minors of $Y$ formed by $n - k$ of
the columns $1, \dots, n - k, l$.

An important point is that all nonzero elements of $A$ are certain
components $z^{i_1 \dots i_{n-k}}$ of the multivector $z$. Any vector $u \in L_{n-k}$
satisfies the system (9). But $\operatorname{rank} A = k$ and so the vectors of
$L_{n-k}$ form the entire set of solutions of the system (9). Thus, (9)
determines the desired subspace and so yields a solution to
problem (a).

Of course problems (a) to (f) can be solved differently, say by
using Section 4 of Chapter VIII.

6. Now let $e_1, \dots, e_n$ be an orthonormal basis in $E_n$. We then
have the following equations:

$$[e_{i_1} \times e_{i_2} \times \dots \times e_{i_k}] = e_{j_1} \wedge e_{j_2} \wedge \dots \wedge e_{j_{n-k}} \tag{10}$$

where the combined collection of indices $(i_1, \dots, i_k, j_1, \dots, j_{n-k})$ is
an even permutation of the natural sequence $(1, 2, \dots, n)$. To see
the truth of equations (10), it suffices to note that the simple
multivector $e_{j_1} \wedge e_{j_2} \wedge \dots \wedge e_{j_{n-k}}$ possesses properties (4) to (7) of
Subsection 2 relative to the vector product $[e_{i_1} \times e_{i_2} \times \dots \times e_{i_k}]$.
But only one multivector can have these properties.

Formulas (10) provide a multiplication table of basis vectors
$e_1, \dots, e_n$ and permit computing the vector product of any vectors
via a termwise multiplication of their expansions in terms of the
basis $e_1, \dots, e_n$. With their aid it is sometimes possible, without
solving (9), to find the factors $y_1, \dots, y_{n-k}$ that enter directly into
the expression $z = y_1 \wedge \dots \wedge y_{n-k}$. This also provides an
alternative solution of problem (a).

Example. Given in a four-dimensional Euclidean space, in the
orthonormal basis $e_1, e_2, e_3, e_4$, the vectors $x_1 = 2e_1 + e_2 + e_3$,
$x_2 = e_1 + e_4$. Find the direction bivector of the orthogonal
complement of the linear hull $L(x_1, x_2)$.

Solution. For the desired bivector we can take the vector
product $[x_1 \times x_2]$. Multiplying the vectors $x_1, x_2$ termwise and using
formulas (10), we find

$$
\begin{aligned}
[x_1 \times x_2] &= [(2e_1 + e_2 + e_3) \times (e_1 + e_4)] \\
&= [e_2 \times e_1] + [e_3 \times e_1] + 2[e_1 \times e_4] + [e_2 \times e_4] + [e_3 \times e_4] \\
&= e_4 \wedge e_3 + e_2 \wedge e_4 + 2 e_2 \wedge e_3 + e_3 \wedge e_1 + e_1 \wedge e_2 \\
&= e_1 \wedge e_2 - 2 e_3 \wedge e_2 - e_4 \wedge e_2 - e_1 \wedge e_3 + 2 e_3 \wedge e_3 + e_4 \wedge e_3 \\
&= (e_1 - 2e_3 - e_4) \wedge (e_2 - e_3)
\end{aligned}
$$

Thus for $L(x_1, x_2)$ the orthogonal complement is $L(y_1, y_2)$, where
$y_1 = e_1 - 2e_3 - e_4$, $y_2 = e_2 - e_3$.
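
The example can be re-checked by machine. The sketch below is illustrative only: it assumes the orthonormal basis of the example, uses 0-based index pairs, and introduces hypothetical helpers `vector_product` (formula (2) grouped into minors) and `wedge_components` (formula (6) of Section 4). It compares the components of $[x_1 \times x_2]$ with those of $y_1 \wedge y_2$ and confirms that $y_1, y_2$ are orthogonal to $x_1, x_2$.

```python
from itertools import combinations, permutations

def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant by the permutation expansion (fine for small orders)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for r in range(n):
            term *= m[r][p[r]]
        total += term
    return total

def wedge_components(vectors):
    """Components of y_1 ^ ... ^ y_m as minors, keyed by increasing columns."""
    m, n = len(vectors), len(vectors[0])
    return {cols: det([[v[c] for c in cols] for v in vectors])
            for cols in combinations(range(n), m)}

def vector_product(xs):
    """Components of [x_1 x ... x x_k] in an orthonormal basis."""
    k, n = len(xs), len(xs[0])
    z = {}
    for J in combinations(range(n), n - k):
        I = tuple(i for i in range(n) if i not in J)
        z[J] = perm_sign(I + J) * det([[x[i] for i in I] for x in xs])
    return z

x1, x2 = [2, 1, 1, 0], [1, 0, 0, 1]
y1, y2 = [1, 0, -2, -1], [0, 1, -1, 0]

z = vector_product([x1, x2])
print(z == wedge_components([y1, y2]))

dot = lambda a, b: sum(s * t for s, t in zip(a, b))
print([dot(y, x) for y in (y1, y2) for x in (x1, x2)])
```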

§ 6. Outer forms and operations on them

1. Let $L$ be an $n$-dimensional linear space, $x_1, \dots, x_k$ a set of
arbitrary vectors in $L$, $\omega(x_1, \dots, x_k)$ a multilinear form with
vector arguments $x_1, \dots, x_k$.

We use the term alternation of the form $\omega(x_1, \dots, x_k)$ for a
multilinear form denoted by $\langle \omega(x_1, \dots, x_k) \rangle$ and defined by

$$\langle \omega(x_1, \dots, x_k) \rangle = \frac{1}{k!} \sum \delta^{i_1 i_2 \dots i_k}_{1\, 2\, \dots\, k}\; \omega(x_{i_1}, \dots, x_{i_k}) \tag{1}$$

where the summation over each of the indices $i_1, \dots, i_k$ runs from 1
to $k$. In particular, for a linear form $\omega(x)$ we have

$$\langle \omega(x) \rangle = \omega(x) \tag{2}$$

For a bilinear form $\omega(x, y)$ we have

$$\langle \omega(x, y) \rangle = \tfrac{1}{2} \left( \omega(x, y) - \omega(y, x) \right)$$

For a form of three arguments we have

$$\langle \omega(x, y, z) \rangle = \tfrac{1}{6} \left( \omega(x, y, z) + \omega(y, z, x) + \omega(z, x, y) - \omega(y, x, z) - \omega(x, z, y) - \omega(z, y, x) \right)$$

From formula (1) it is clear that

(1) $\langle \omega \rangle$ is indeed a multilinear form, that is to say, it possesses
linearity in each argument; for example, in the first argument:

$$\langle \omega(\alpha x_1' + \beta x_1'', x_2, \dots, x_k) \rangle = \alpha \langle \omega(x_1', x_2, \dots, x_k) \rangle + \beta \langle \omega(x_1'', x_2, \dots, x_k) \rangle$$

(2) $\langle \omega(x_1, \dots, x_k) \rangle$ has skew symmetry in any pair of
arguments; for instance, in the first pair:

$$\langle \omega(x_1, x_2, x_3, \dots, x_k) \rangle = -\langle \omega(x_2, x_1, x_3, \dots, x_k) \rangle$$


2. The form $\omega$ will be termed skew if $\langle \omega \rangle = \omega$.

By equation (2) every linear form, that is, a form in one
argument, must be skew.

From the definition of an alternation it follows that a skew form
has skew symmetry in every pair of arguments.

Conversely, if a multilinear form $\omega$ has skew symmetry in any
pair of its arguments, then it is skew in the sense of the foregoing
definition, or $\langle \omega \rangle = \omega$. The proof is left to the reader.
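
The alternation is simple to program. The sketch below is an illustration, not the book's construction: a multilinear form is modelled as an ordinary Python function of $k$ coordinate lists, the name `alternate` is an assumption, and the coefficient in formula (1) is read as the sign of the permutation $(i_1, \dots, i_k)$, the terms with repeated indices being dropped.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

def alternate(form, k):
    """Return the alternation <form>: the signed average of form over
    all orderings of its k arguments."""
    def alt(*xs):
        total = sum(perm_sign(p) * form(*(xs[i] for i in p))
                    for p in permutations(range(k)))
        return Fraction(total, factorial(k))
    return alt

# A trilinear form on three-dimensional space that is not skew.
omega = lambda x, y, z: x[0] * y[1] * z[2]
alt_omega = alternate(omega, 3)

x, y, z = [1, 2, 3], [0, -1, 4], [5, 7, -2]
print(alt_omega(x, y, z), alt_omega(y, x, z))          # opposite signs
print(alternate(alt_omega, 3)(x, y, z) == alt_omega(x, y, z))
```

The last line illustrates the statement of Subsection 2: the form $\langle \omega \rangle$ is skew, and so a second alternation leaves it unchanged.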

3. Let $\omega(x_1, \dots, x_k)$ be a skew form. If the vectors $x_1, \dots, x_k$
are linearly dependent, then the corresponding value of the form $\omega$
is zero.

The assertion follows from the properties of linearity in each
argument and skew symmetry in each pair of arguments.

4. If the number of arguments of a skew form $\omega$ exceeds the
dimension of the space ($k > n$), then $\omega$ is identically zero.

This assertion is a consequence of Subsection 3.

5. The important thing to grasp is that the skew form
$\omega(x_1, \dots, x_k)$ is a function of a simple multivector
$p = x_1 \wedge \dots \wedge x_k$. In other words, the value of $\omega$ remains unaltered
if the vectors $x_1, \dots, x_k$ are replaced by other vectors $y_1, \dots, y_k$,
provided that

$$y_1 \wedge \dots \wedge y_k = x_1 \wedge \dots \wedge x_k$$

Let us now prove this. Due to Subsections 3, 4, it suffices to
consider the case where $x_1, \dots, x_k$ are independent. Then the equation
$y_1 \wedge \dots \wedge y_k = x_1 \wedge \dots \wedge x_k$ is equivalent to the fact that we
have the expansions

$$
\begin{aligned}
y_1 &= \alpha_{11} x_1 + \dots + \alpha_{1k} x_k \\
    &\;\;\vdots \\
y_k &= \alpha_{k1} x_1 + \dots + \alpha_{kk} x_k
\end{aligned}
\tag{3}
$$

where $\det \| \alpha_{ij} \| = 1$ (see Subsection 2, Section 4). From
equations (3) and also from the properties of linearity and skew
symmetry of the form $\omega$ we have

$$\omega(y_1, \dots, y_k) = \det \| \alpha_{ij} \| \, \omega(x_1, \dots, x_k) = \omega(x_1, \dots, x_k)$$

which proves our assertion.

6. By Subsection 5, the domain of definition of a skew form
$\omega(x_1, \dots, x_k) = \omega(p)$ is the set of all simple multivectors of
order $k$.

Definition. A skew form as a function of a multivector is termed
an outer form. The order $k$ of the argument $p$ is conventionally
called the degree of the outer form $\omega(p)$.

We also say that $\omega(p)$ is a $k$-form, which is denoted frequently
by $\omega^k(p)$.

7. Outer forms of a given degree $k$ constitute a linear subspace
in the space of all multilinear forms with arguments $x_1, \dots, x_k$.
This is true since outer forms of one and the same degree can be
added and multiplied by a scalar to yield outer forms of the same
degree (because a linear combination of skew forms in $x_1, \dots, x_k$
is a skew form in the same arguments, which is obvious).

8. We now define an outer product of two outer forms of ar¬ 
bitrary degree. The outer product of the form w*(p), p = Xx A 
A ... A Xu multiplied by the form ©'(«7), q ={xu + \A ... A Xu+i) is 
an outer form of degree / denoted by ©'*(p) A ©'(9) and de¬ 
fined by 

©*(p) A ©'(<?) ==-^^;^y^(©*(x:i, .... Xk)(o‘{Xu + i, .... Xu+i)) 

where (©'‘(Xi, ..., A:fc)©'(Xft+i, .... Xu+t)) is the alternation of a 
multilinear form of degree k + l obtained by ordinary multiplica¬ 
tion of ©'• by <i)'. 

We now prove that ©'•(p) A ©'((?) is an outer form of degree 
k 1 of the multivector p A q: 

©* (p) A co' (<7) = ©*’+' (p A q) 
Proof. The fact that $\omega^k(p) \wedge \omega^l(q)$ is a skew form of degree
$k + l$ follows directly from the definition of an alternation. Then
we have

$$\omega^k(p) \wedge \omega^l(q) = \omega^{k+l}(x_1, \ldots, x_{k+l}) = \omega^{k+l}(x_1 \wedge \ldots \wedge x_{k+l})$$

But by Subsections 10, 11, Section 2,

$$x_1 \wedge \ldots \wedge x_k \wedge x_{k+1} \wedge \ldots \wedge x_{k+l} = (x_1 \wedge \ldots \wedge x_k) \wedge (x_{k+1} \wedge \ldots \wedge x_{k+l}) = p \wedge q$$

which is what we sought.
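
The definition of Subsection 8 can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes that the alternation of a form is the signed mean over all permutations of its arguments and that the normalizing coefficient is $(k+l)!/(k!\,l!)$; the helper names `perm_sign`, `alternation`, `outer_product` and `linear_form` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def perm_sign(perm):
    """Return +1 for an even permutation of 0..k-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def alternation(form, degree):
    """Alternation of a multilinear form: signed mean over argument permutations."""
    def alternated(*vectors):
        total = sum(
            perm_sign(p) * form(*(vectors[i] for i in p))
            for p in permutations(range(degree))
        )
        return Fraction(total) / factorial(degree)
    return alternated


def outer_product(form_k, k, form_l, l):
    """Outer product of a k-form and an l-form (assumed coefficient (k+l)!/(k! l!))."""
    def ordinary_product(*vectors):
        return form_k(*vectors[:k]) * form_l(*vectors[k:])
    coefficient = factorial(k + l) // (factorial(k) * factorial(l))
    alternated = alternation(ordinary_product, k + l)
    return lambda *vectors: coefficient * alternated(*vectors)


def linear_form(coefficients):
    """Linear form x -> sum of coefficients[i] * x[i]."""
    return lambda x: sum(c * xi for c, xi in zip(coefficients, x))


u = linear_form((1, 2, 3))
v = linear_form((0, 1, -1))
x, y = (1, 0, 2), (3, 1, 1)

uv = outer_product(u, 1, v, 1)
vu = outer_product(v, 1, u, 1)
print(uv(x, y), u(x) * v(y) - u(y) * v(x))  # the two values agree
print(vu(x, y))                             # opposite sign for two 1-forms
```

For two linear forms the sketch reduces to $u(x)v(y) - u(y)v(x)$, which is the smallest case in which the skew symmetry of the product can be seen directly.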

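9. We now prove that an outer product of outer forms has the
following properties.

(1) $(\alpha\omega^k) \wedge \omega^l = \omega^k \wedge (\alpha\omega^l) = \alpha(\omega^k \wedge \omega^l)$ for any scalar $\alpha$
and any outer forms $\omega^k$ and $\omega^l$.

(2) $(\omega_1^k + \omega_2^k) \wedge \omega^l = \omega_1^k \wedge \omega^l + \omega_2^k \wedge \omega^l$ for any outer forms
$\omega_1^k$, $\omega_2^k$, $\omega^l$.

(3) The outer multiplication of outer forms is skew-commutative,
namely,

$$\omega^k \wedge \omega^l = (-1)^{kl}\, \omega^l \wedge \omega^k$$

for any $\omega^k$, $\omega^l$, whence it follows, in particular, that for
$\omega^l = \omega^k$ we have $\omega^k \wedge \omega^k = 0$ for any odd $k$.

(4) The outer multiplication of outer forms is associative:

$$(\omega^k \wedge \omega^l) \wedge \omega^m = \omega^k \wedge (\omega^l \wedge \omega^m)$$

for any $\omega^k$, $\omega^l$, $\omega^m$.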

10 . Thus, for outer forms we have a complete analogy with 
Grassmann algebra (see footnote on page 352 ) which we defined 
for contravariant multivectors, that is to say, multivectors in L.

However, we will demonstrate in the next section that this is 
no mere analogy but is precisely a Grassmann algebra; true, it is 
a Grassmann algebra of covariant multivectors, that is to say, 
multivectors in the space L*, which is conjugate to the given 
space L. 

In this way the properties enumerated in Subsection 9 will be 
proved. 

§ 7 . Outer forms and covariant multivectors 

1. Given a multilinear form

$$\omega(x_1, \ldots, x_k) = \sum \omega_{i_1 \ldots i_k}\, x_1^{i_1} \ldots x_k^{i_k} \qquad (1)$$

where $x_j^i$ are the components of a vector $x_j$ relative to a basis
$e_1, \ldots, e_n$ in $L$. We know that with each such form there is
associated invariantly and one-to-one a tensor:

$$\omega = \sum \omega_{i_1 \ldots i_k}\, e^{i_1} \ldots e^{i_k} \qquad (2)$$

where $e^1, \ldots, e^n$ is the basis in $L^*$ reciprocal to the basis
$e_1, \ldots, e_n$ in $L$ (see Section 7, Chapter V).

The numerical value of the multilinear form (1) on specifically
taken vectors $x_1, \ldots, x_k$ is a complete contraction of tensor (2)
with these vectors, vector $x_1$ being contracted with $e^{i_1}$, vector
$x_2$ with $e^{i_2}$, and so on.

Now recall that every vector of the conjugate space $L^*$ is a
linear form in a vector argument in $L$. Besides, the value of this
linear form on the given vector $x$ of $L$ is precisely the contraction
of $x$ with that vector of $L^*$ which represents the form.

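In the given case, since the bases $e_1, \ldots, e_n$ and $e^1, \ldots, e^n$
are reciprocal, we have for the vector $x_j = x_j^1 e_1 + \ldots + x_j^n e_n$

$$e^i(x_j) = x_j^1 e^i(e_1) + \ldots + x_j^i e^i(e_i) + \ldots + x_j^n e^i(e_n) = x_j^i \qquad (3)$$

Now in place of (1) we can write

$$\omega(x_1, \ldots, x_k) = \sum \omega_{i_1 \ldots i_k}\, e^{i_1}(x_1) \ldots e^{i_k}(x_k) \qquad (4)$$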


Formula (4) shows that an arbitrary multilinear form
$\omega(x_1, \ldots, x_k)$ can be expanded into a sum of products of
independent forms $e^1(x), \ldots, e^n(x)$ in precisely the same way that the
tensor $\omega$ of this form is expanded into a sum of products of
first-order tensors $e^1, \ldots, e^n$. This expresses the familiar isomorphism
between multilinear forms and their tensors. For this reason,
everything that has been said about multivectors can be carried over
directly to outer forms.

However, two points must be kept in mind. 

(1) In Sections 1 to 4 we dealt with contravariant multivectors.
We now deal with multilinear forms to which correspond covariant
tensors. Therefore, first we have to define covariant multivectors.
This definition is given in exact analogy with the definition
of contravariant multivectors: a skew covariant tensor is called a
covariant multivector, and a skew tensor is defined to coincide
with its alternation.

(2) The alternation of forms is defined here without any direct
analogy with the alternation of tensors. That is why, in Section 5,
we introduced the symbol ( ) instead of [ ]. Therefore we must
prove the following assertion.

If $\omega(x_1, \ldots, x_k)$ is an arbitrary multilinear form (1), and $\omega$ is
its tensor (2), then the tensor of the alternation of this form is
equal to the alternation of its tensor (that is, the form
$(\omega(x_1, \ldots, x_k))$ has the tensor $[\omega]$).
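Proof. The alternation of a form is defined in Section 5 with the
aid of permutations of the arguments $x_1, \ldots, x_k$, namely,

$$(\omega(x_1, \ldots, x_k)) = \frac{1}{k!} \sum \delta^{j_1 \ldots j_k}_{1 \ldots k}\, \omega(x_{j_1}, \ldots, x_{j_k})$$

where $\delta^{j_1 \ldots j_k}_{1 \ldots k}$ is $+1$ or $-1$ according as the permutation
$j_1, \ldots, j_k$ is even or odd; whence and also from (4) we have

$$(\omega(x_1, \ldots, x_k)) = \frac{1}{k!} \sum \delta^{j_1 \ldots j_k}_{1 \ldots k}\, \omega_{i_1 \ldots i_k}\, e^{i_1}(x_{j_1}) \ldots e^{i_k}(x_{j_k}) \qquad (5)$$

The indices $i_1, \ldots, i_k$ in (5) assume all values from 1 to $n$; the
indices $j_1, \ldots, j_k$ form all possible permutations of the numbers
$1, \ldots, k$.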

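On the other hand, from the definition of the alternation of a
tensor and due to formulas (8) and (9), Section 1, we have

$$[\omega] = \frac{1}{k!} \sum \delta^{i_1 \ldots i_k}_{\alpha_1 \ldots \alpha_k}\, \omega_{i_1 \ldots i_k}\, e^{\alpha_1} \ldots e^{\alpha_k} \qquad (6)$$

Here all the indices $i_1, \ldots, i_k$, $\alpha_1, \ldots, \alpha_k$ independently run
through all values from 1 to $n$.

We have to prove that (6) yields the tensor of the multilinear
form (5). We know that (6) is the tensor of the multilinear form

$$\frac{1}{k!} \sum \delta^{i_1 \ldots i_k}_{\alpha_1 \ldots \alpha_k}\, \omega_{i_1 \ldots i_k}\, e^{\alpha_1}(x_1) \ldots e^{\alpha_k}(x_k) \qquad (7)$$

It is therefore necessary to establish the coincidence of the
multilinear forms (5) and (7). Clearly it suffices to verify the equation

$$\sum_{j} \delta^{j_1 \ldots j_k}_{1 \ldots k}\, e^{i_1}(x_{j_1}) \ldots e^{i_k}(x_{j_k}) = \sum_{\alpha} \delta^{i_1 \ldots i_k}_{\alpha_1 \ldots \alpha_k}\, e^{\alpha_1}(x_1) \ldots e^{\alpha_k}(x_k) \qquad (8)$$

for any fixed values of the indices $i_1, \ldots, i_k$, assuming that they
are all distinct (otherwise (8) yields 0 = 0). In the right-hand
member of (8) only those terms are nonzero for which the indices
$\alpha_1, \ldots, \alpha_k$ form a permutation of $i_1, \ldots, i_k$. For this reason, the
right and left members of (8) have the same number $k!$ of nonzero
terms. There is a natural one-to-one correspondence between them.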

Precisely, let $j_1, \ldots, j_k$ be an arbitrarily chosen permutation
of the numbers $1, \ldots, k$. Then by permuting the forms
$e^{i_1}(x_{j_1}), \ldots, e^{i_k}(x_{j_k})$ in the appropriate term of the left member
of (8) we get a new arrangement:

$$e^{i_1}(x_{j_1}) \ldots e^{i_k}(x_{j_k}) = e^{\alpha_1}(x_1) \ldots e^{\alpha_k}(x_k) \qquad (9)$$

Associated with it is a definite term of the right-hand member
of (8). This correspondence is one-to-one since distinct
permutations $(j_1, \ldots, j_k)$ yield in (9) distinct permutations
$(\alpha_1, \ldots, \alpha_k)$.

It remains merely to note that the coefficients of the corresponding
terms on the left and right sides of (8) are equal. Indeed, when
passing from the left member of identity (9) to its right member,
we permute the factors $e^{i_1}(x_{j_1}), \ldots, e^{i_k}(x_{j_k})$ in such a fashion that
the number labels of the arguments increase. This same
permutation of factors carries the indices $(i_1, \ldots, i_k)$ into the indices
$(\alpha_1, \ldots, \alpha_k)$. Hence

$$\delta^{j_1 \ldots j_k}_{1 \ldots k} = \delta^{i_1 \ldots i_k}_{\alpha_1 \ldots \alpha_k} \qquad (10)$$

since the permutations on the right and left of (10) are of the
same parity. Thus, we have established the truth of (8), whence it
follows that $[\omega]$ is the tensor of the form $(\omega(x_1, \ldots, x_k))$.


2. From now on there is no need to retain the symbol ( ), and in
future we will denote the alternation of a multilinear form by
square brackets just as we do the alternation of a tensor; that is,
we assume

$$[\omega(x_1, \ldots, x_k)] = (\omega(x_1, \ldots, x_k)) \qquad (11)$$


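3. Now we can carry over directly to forms the basic results
and relations established for multivectors in Sections 1 to 4.

(1) Given the linear forms (each in one argument) $u_1(x_1)$,
$u_2(x_2)$, ..., $u_k(x_k)$. The alternation of their product can be written
thus:

$$[u_1 \ldots u_k] = \frac{1}{k!} \sum \delta^{j_1 \ldots j_k}_{1 \ldots k}\, u_{j_1}(x_1) \ldots u_{j_k}(x_k) \qquad (12)$$

Note that in (12) the arguments $x_1, \ldots, x_k$ in all terms are
arranged in a natural sequence.

In particular, for the basis forms $e^1, \ldots, e^n$, the total number
being any number $k$ and with any kind of arrangement, we have

$$[e^{i_1}(x_1) \ldots e^{i_k}(x_k)] = \frac{1}{k!} \sum \delta^{i_1 \ldots i_k}_{\alpha_1 \ldots \alpha_k}\, e^{\alpha_1}(x_1) \ldots e^{\alpha_k}(x_k) \qquad (13)$$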
(2) If $\omega(x_1, \ldots, x_k)$ is any form written as (4), then

$$[\omega(x_1, \ldots, x_k)] = \sum \omega_{i_1 \ldots i_k} [e^{i_1}(x_1) \ldots e^{i_k}(x_k)] = \sum \omega_{[i_1 \ldots i_k]}\, e^{i_1}(x_1) \ldots e^{i_k}(x_k)$$

In this connection, see Subsection 15, Section 1.

(3) Skew forms may be described by the condition

$$\omega_{i_1 \ldots i_k} = \omega_{[i_1 \ldots i_k]}$$

that is, the condition of skew symmetry of their coefficients with
respect to any pair of indices (see Section 1, Subsection 18).

(4) The outer product of several basis forms $e^1(x), \ldots, e^n(x)$,
taken in any quantity and in any order, may be expressed by the
formula

$$e^{i_1}(x_1) \wedge e^{i_2}(x_2) \wedge \ldots \wedge e^{i_k}(x_k) = k!\, [e^{i_1}(x_1)\, e^{i_2}(x_2) \ldots e^{i_k}(x_k)] \qquad (14)$$

Formula (14) may be proved like (7), Section 2, since we can
understand alternation in the sense of formula (13) (which is
constructed in the same way as (9) of Section 1).

4. From formulas (3), (13) and (14) we get

$$e^{i_1}(x_1) \wedge e^{i_2}(x_2) \wedge \ldots \wedge e^{i_k}(x_k) = \sum \delta^{i_1 \ldots i_k}_{\alpha_1 \ldots \alpha_k}\, x_1^{\alpha_1} \ldots x_k^{\alpha_k} = V^{i_1 \ldots i_k} \qquad (15)$$

Here

$$V^{i_1 \ldots i_k} = \begin{vmatrix} x_1^{i_1} & \ldots & x_1^{i_k} \\ \vdots & & \vdots \\ x_k^{i_1} & \ldots & x_k^{i_k} \end{vmatrix}$$

is a minor of order $k$ of the matrix

$$X = \begin{pmatrix} x_1^1 & \ldots & x_1^n \\ \vdots & & \vdots \\ x_k^1 & \ldots & x_k^n \end{pmatrix}$$

made up of the components of the vectors $x_1, \ldots, x_k$. The indices
$i_1, \ldots, i_k$ indicate the numbers of the columns of matrix $X$ that
participate in the minor $V^{i_1 \ldots i_k}$. The word "minor" is used in a
conventional sense since it is not assumed that the indices
$i_1, \ldots, i_k$ proceed in increasing order.

Formula (15) yields numerical values of one-member outer
forms, which are outer products of the basis forms $e^1(x), \ldots, e^n(x)$.

Like any outer forms, these one-member forms are functions of a
simple multivector; in the given case, of the multivector
$p = x_1 \wedge x_2 \wedge \ldots \wedge x_k$. The numbers $V^{i_1 \ldots i_k}$ coincide with the
components of the simple multivector $p$ (see Section 4, Subsections 5,
6). If the space $L$ is equipped with a Euclidean metric and the
basis $e_1, \ldots, e_n$ is orthonormal, then $V^{i_1 \ldots i_k}$, for $i_1 < i_2 < \ldots < i_k$,
are oriented volumes of the projections of the multivector $p$
on the coordinate planes (see Section 4, Subsection 5).

5. An arbitrary outer form $\omega^k(p)$, $p = x_1 \wedge x_2 \wedge \ldots \wedge x_k$, may
be expressed in the following manner, in terms of the basis forms
(as in Subsection 7, Section 2):

$$\omega^k(p) = \sum{}^{*}\, \omega_{i_1 \ldots i_k}\, e^{i_1}(x_1) \wedge \ldots \wedge e^{i_k}(x_k) \qquad (16)$$

Recall that the starred sigma denotes summation with the proviso
that $i_1 < i_2 < \ldots < i_k$.
From (15) and (16) we have

$$\omega^k(p) = \sum{}^{*}\, \omega_{i_1 \ldots i_k}\, V^{i_1 \ldots i_k} \qquad (16a)$$

6. Since the value of the linear form $e^i$ on an arbitrary vector
$x = x^1 e_1 + \ldots + x^n e_n$ is equal to the component $x^i$, the symbol $x^i$
is used to denote the linear form $e^i(x) = x^i$, and instead of (16)
we write

$$\omega^k = \sum{}^{*}\, \omega_{i_1 \ldots i_k}\, x^{i_1} \wedge x^{i_2} \wedge \ldots \wedge x^{i_k} \qquad (17)$$


7. For an outer form of degree two (or, as we sometimes say,
an outer quadratic form), the expansion (16) in three-dimensional
space looks like this:

$$\omega = \omega_{12}\, e^1(x_1) \wedge e^2(x_2) + \omega_{23}\, e^2(x_1) \wedge e^3(x_2) + \omega_{13}\, e^1(x_1) \wedge e^3(x_2) \qquad (18)$$

Besides, taking into account Subsection 6, we can also write

$$\omega = \omega_{12}\, x^1 \wedge x^2 + \omega_{23}\, x^2 \wedge x^3 + \omega_{13}\, x^1 \wedge x^3 \qquad (19)$$

Both (18) and (19) express the same thing in different notations.
Formulas (17) and (19) are conventional and are apt to give rise
to misunderstanding. When using them, one has to remember that
$x^1, x^2, \ldots$ are not the components of vectors but denote linear
forms; for instance,

$$x^1 \wedge x^2 = e^1(x_1) \wedge e^2(x_2) = \begin{vmatrix} e^1(x_1) & e^2(x_1) \\ e^1(x_2) & e^2(x_2) \end{vmatrix}$$

This is more clearly brought out in a numerical case. Let

$$x_1 = x_1^1 e_1 + x_1^2 e_2 + x_1^3 e_3 = 2e_1 + 3e_2 + x_1^3 e_3,$$
$$x_2 = x_2^1 e_1 + x_2^2 e_2 + x_2^3 e_3 = -1e_1 + 5e_2 + x_2^3 e_3$$

Then

$$x^1 \wedge x^2 = \begin{vmatrix} 2 & 3 \\ -1 & 5 \end{vmatrix} = 13$$

Here we have found the numerical value of the form
$x^1 \wedge x^2 = e^1 \wedge e^2$ on a specific pair of vectors. But one must bear in mind
that the form $e^1 \wedge e^2$ itself, as an element of Grassmann algebra,
is not a number.
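
The numerical case above can be reproduced mechanically. The sketch below is an illustration only: the function name `basis_outer_value` is hypothetical, and the third components of the two vectors, which the text leaves unspecified, are filled with arbitrary assumed values to show that they do not affect $x^1 \wedge x^2$.

```python
from itertools import permutations


def perm_sign(perm):
    """Return +1 for an even permutation of 0..k-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def basis_outer_value(indices, vectors):
    """Value of e^{i_1} ^ ... ^ e^{i_k} on the given k vectors.

    Computed as the minor of the component matrix built from the columns
    listed in `indices` (1-based), expanded as a signed sum over permutations.
    """
    k = len(indices)
    total = 0
    for p in permutations(range(k)):
        term = perm_sign(p)
        for row in range(k):
            term *= vectors[row][indices[p[row]] - 1]
        total += term
    return total


x1 = (2, 3, 7)     # third component is an arbitrary assumed value
x2 = (-1, 5, -4)   # third component is an arbitrary assumed value

print(basis_outer_value((1, 2), (x1, x2)))  # value of x^1 ^ x^2 on the pair
print(basis_outer_value((2, 1), (x1, x2)))  # indices swapped: sign changes
```

The second call uses the indices in decreasing order, which is the "conventional" minor mentioned in Subsection 4.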
8. In integral calculus and in the theory of differential equations
use is made of outer forms whose arguments are the differentials
of variables. Then in expressions of the type (17) or (19) one
writes $dx^1, \ldots, dx^n$ instead of $x^1, \ldots, x^n$. For instance, one
frequently encounters an outer form of type (19) in the notation

$$\omega = P\, dx^1 \wedge dx^2 + Q\, dx^2 \wedge dx^3 + R\, dx^3 \wedge dx^1$$

where the coefficients $P = \omega_{12}$, $Q = \omega_{23}$, $R = -\omega_{13}$ are
themselves functions of the arguments $x^1, x^2, x^3$.


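9. An outer form in the notation of (17) is convenient in cases
where one has to pass to a new basis and the old components are
given in terms of the new ones:

$$
\begin{aligned}
x^1 &= P^1_{1'} x^{1'} + \ldots + P^1_{n'} x^{n'}, \\
&\;\;\vdots \\
x^n &= P^n_{1'} x^{1'} + \ldots + P^n_{n'} x^{n'}
\end{aligned}
\qquad (20)
$$

Then we have the following equation:

$$x^{i_1} \wedge x^{i_2} \wedge \ldots \wedge x^{i_k} = \sum{}^{*}\, D^{i_1 \ldots i_k}_{j'_1 \ldots j'_k}\, x^{j'_1} \wedge x^{j'_2} \wedge \ldots \wedge x^{j'_k} \qquad (21)$$

where $D^{i_1 \ldots i_k}_{j'_1 \ldots j'_k}$ is the minor of the matrix

$$P = \begin{pmatrix} P^1_{1'} & \ldots & P^1_{n'} \\ \vdots & & \vdots \\ P^n_{1'} & \ldots & P^n_{n'} \end{pmatrix}$$

at the intersection of rows labelled $i_1, \ldots, i_k$ and columns labelled
$j'_1, \ldots, j'_k$ $(j'_1 < j'_2 < \ldots < j'_k)$.

Formula (21) is proved by termwise multiplication of the linear
forms $x^{i_1}, \ldots, x^{i_k}$, which are to be viewed as linear combinations
of the forms $x^{1'}, \ldots, x^{n'}$ in accord with (20). We then get

$$
\begin{aligned}
x^{i_1} \wedge \ldots \wedge x^{i_k}
&= \Big( \sum P^{i_1}_{j'_1} x^{j'_1} \Big) \wedge \ldots \wedge \Big( \sum P^{i_k}_{j'_k} x^{j'_k} \Big) \\
&= \sum P^{i_1}_{j'_1} \ldots P^{i_k}_{j'_k}\, x^{j'_1} \wedge \ldots \wedge x^{j'_k} \\
&= \sum{}^{*}\, D^{i_1 \ldots i_k}_{j'_1 \ldots j'_k}\, x^{j'_1} \wedge \ldots \wedge x^{j'_k}
\end{aligned}
$$

Remark. Formula (21) can be written immediately on the basis
of Subsections 5, 6, Section 4. In formula (5) of that section it
suffices merely to take the forms $x^{i_1}, \ldots, x^{i_k}$ for the vectors
$b_1, \ldots, b_k$ and the forms $x^{1'}, \ldots, x^{n'}$ for the vectors $e_1, \ldots, e_n$.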
10. We conclude this section with three important theorems on
outer forms that follow from definitions and the results of
Section 3 by replacement of $L$ by $L^*$.

Theorem 1. The rank $r$ of any outer form of degree two is an
even number that does not exceed the dimension of the space
($r = 2m \le n$).

Theorem 2. If $\omega^2(x \wedge y)$ is an outer form of degree two of rank
$r = 2m$, then there is a system of independent linear forms
$p_1, \ldots, p_m$, $q_1, \ldots, q_m$ such that

$$\omega^2(x \wedge y) = p_1(x) \wedge q_1(y) + \ldots + p_m(x) \wedge q_m(y)$$

Theorem 3 (Cartan's lemma for outer forms). Let $p_1(x), \ldots, p_s(x)$,
$q_1(x), \ldots, q_s(x)$ be linear forms and let $p_1(x), \ldots, p_s(x)$
be independent. To have

$$p_1(x) \wedge q_1(y) + \ldots + p_s(x) \wedge q_s(y) = 0$$

it is necessary and sufficient that there exist expansions of the
type

$$q_i(x) = a_{i1} p_1(x) + \ldots + a_{is} p_s(x)$$

where $a_{ij} = a_{ji}$ ($a_{ij}$ are numerical coefficients).

§ 8. Outer forms in three-dimensional Euclidean space 

1. To give a geometric illustration of the material of the
preceding two sections we will consider outer forms in three-dimensional
Euclidean space $E_3$ and we will show that they are closely related
to the familiar operations of elementary vector algebra. We assume
that $e_1, e_2, e_3$ constitute an orthonormal basis, that $a, b, p, q$ are
fixed vectors, and that $x, y, z$ are variable vectors ranging over the
whole of $E_3$. We denote the vector components by lower indices
since the position of the indices is immaterial due to the
orthonormality of the basis (in such a basis the contravariant components
are equal to the covariant components of a vector, see
Chapter VIII).

By Subsection 4, Section 6, we have to consider $k$-forms only
for $k = 1, 2, 3$.

2. Recall (from Chapter VIII) that any linear form in Euclidean
space may be represented as a scalar product of a constant vector $a$
into a variable vector $x$, and every vector $a \in E_3$ determines a
linear form $(a, x)$, which we now denote by $\omega_a^1(x)$:

$$\omega_a^1(x) = (a, x) = a_1 x_1 + a_2 x_2 + a_3 x_3 \qquad (1)$$

Formula (1) establishes a linear isomorphism between $E_3$, which
is regarded as a set of vectors $a$, and the space of linear forms.
Putting $a = e_i$, we get the basis linear forms $\omega_{e_i}^1$, which we
abbreviate to $\omega_i^1$:

$$\omega_i^1(x) = (e_i, x) = x_i, \quad i = 1, 2, 3 \qquad (2)$$

Formula (1) may be viewed as an expansion of the form $\omega_a^1$
relative to the basis (2) and we can rewrite it thus:

$$\omega_a^1 = a_1 \omega_1^1 + a_2 \omega_2^1 + a_3 \omega_3^1 \qquad (3)$$

3. Let us consider the mixed product $(a, x, y)$ in which the first
factor is fixed and the other two vary. It is linear in each of the
arguments and skew-symmetric in $x$ and $y$ ($(a, y, x) = -(a, x, y)$),
so that it is a 2-form, which we denote by $\omega_a^2(x \wedge y)$. From
elementary analytic geometry we know that the numerical value of
$\omega_a^2(x \wedge y)$ is equal to the product of the area $S$ of the bivector
$x \wedge y$ by the length $|a|$ of vector $a$ and by the cosine of the
angle $\alpha$ between the given vector $a$ and the oriented (in standard
fashion) normal of the bivector $x \wedge y$ (Fig. 75). Thus

$$\omega_a^2(x \wedge y) = (a, x, y) = S |a| \cos \alpha \qquad (4)$$

[Fig. 75]

Denote by $X$ the matrix made up of the components of vectors $x$
and $y$:

$$X = \begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{pmatrix}$$

and set

$$V_{ij} = \begin{vmatrix} x_i & x_j \\ y_i & y_j \end{vmatrix}, \quad i, j = 1, 2, 3 \qquad (5)$$

Then by the familiar formula for a mixed product we have

$$\omega_a^2(x \wedge y) = a_1 V_{23} + a_2 V_{31} + a_3 V_{12} \qquad (6)$$
We know (see Section 7, formulas (16) to (19)) that an
arbitrary 2-form $\omega^2(x \wedge y)$ in $E_3$ looks like this:

$$\omega^2(x \wedge y) = \omega_{12} V_{12} + \omega_{13} V_{13} + \omega_{23} V_{23} \qquad (7)$$

Take vector $a$ with components

$$a_1 = \omega_{23}, \quad a_2 = -\omega_{13}, \quad a_3 = \omega_{12} \qquad (8)$$

Then the right members of (6) and (7) coincide.

Conclusion. Any outer form of degree two in three-dimensional
Euclidean space can be represented as a mixed product $(a, x, y)$,
given an appropriate choice of the fixed vector $a$, which is uniquely
defined by formulas (8).
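
The conclusion can be checked on concrete numbers. The following sketch is illustrative only: the function names and the sample coefficients are assumptions, and the mixed product is taken as the determinant of the components in an orthonormal basis.

```python
def mixed(a, x, y):
    """Mixed product (a, x, y) as a 3x3 determinant of components."""
    return (
        a[0] * (x[1] * y[2] - x[2] * y[1])
        - a[1] * (x[0] * y[2] - x[2] * y[0])
        + a[2] * (x[0] * y[1] - x[1] * y[0])
    )


def minor(x, y, i, j):
    """Second-order minor V_ij built from components i, j (1-based) of x and y."""
    return x[i - 1] * y[j - 1] - x[j - 1] * y[i - 1]


def two_form(w12, w13, w23, x, y):
    """Value of the 2-form with coefficients w12, w13, w23, as in formula (7)."""
    return w12 * minor(x, y, 1, 2) + w13 * minor(x, y, 1, 3) + w23 * minor(x, y, 2, 3)


def vector_of_form(w12, w13, w23):
    """Vector a assigned to the 2-form by formulas (8)."""
    return (w23, -w13, w12)


w12, w13, w23 = 2, -1, 3          # sample coefficients (assumed)
x, y = (1, 2, 3), (4, 0, -1)      # sample arguments (assumed)
a = vector_of_form(w12, w13, w23)
print(two_form(w12, w13, w23, x, y), mixed(a, x, y))  # the two values agree
```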

4. Setting $a = e_i$, we get the 2-form $\omega_{e_i}^2$, which we abbreviate
to $\omega_i^2$ ($i = 1, 2, 3$). Using (2), (5) and (6) and taking into
account Subsections 5 to 7, Section 7, we can write

$$
\begin{aligned}
\omega_1^2(x \wedge y) &= V_{23} = -V_{32} = \omega_2^1(x) \wedge \omega_3^1(y), \\
\omega_2^2(x \wedge y) &= V_{31} = -V_{13} = \omega_3^1(x) \wedge \omega_1^1(y), \\
\omega_3^2(x \wedge y) &= V_{12} = -V_{21} = \omega_1^1(x) \wedge \omega_2^1(y)
\end{aligned}
\qquad (9)
$$

The numerical value of each of the 2-forms of (9) is equal to the
area of the projection of the bivector $x \wedge y$ on the corresponding
coordinate plane. Geometrically it is clear that this should be so.
For example, the product of the area of the bivector $x \wedge y$ by the
cosine of the angle $\gamma$ between the basis vector $e_3$ and the normal of
the bivector $x \wedge y$ is equal to the area of the projection of the
bivector $x \wedge y$ on the coordinate plane $E_{12}$ spanned by the vectors $e_1$
and $e_2$ (see Fig. 76). The length of the vector participating in (4)
is in this case equal to unity ($a = e_3$).

[Fig. 76]
Using (9), we can rewrite the expansion (6) thus:

$$\omega_a^2 = a_1 \omega_1^2 + a_2 \omega_2^2 + a_3 \omega_3^2 = a_1\, \omega_2^1 \wedge \omega_3^1 + a_2\, \omega_3^1 \wedge \omega_1^1 + a_3\, \omega_1^1 \wedge \omega_2^1 \qquad (10)$$

Formula (10) establishes a linear isomorphism between the set
of vectors $a$ and the space of 2-forms with arguments in $E_3$,
namely, to the addition of forms $\omega_a^2$ and $\omega_b^2$ there corresponds an
addition of vectors:

$$\omega_a^2 + \omega_b^2 = \omega_{a+b}^2$$

to the multiplication of forms by a scalar there corresponds a
multiplication of the vector by that scalar:

$$\lambda\, \omega_a^2 = \omega_{\lambda a}^2$$

5. A mixed product of three variable vectors $(x, y, z)$ is an outer
form of degree three, which we denote by $\omega_1^3$:

$$\omega_1^3(x \wedge y \wedge z) = (x, y, z)$$

Its value on the multivector $x \wedge y \wedge z$ is equal to the oriented
volume $V$ of that multivector. Since the space of outer forms of
degree three is one-dimensional ($k = n$; in this connection, see
Chapter V, Section 8), we have

$$\omega_\lambda^3(x, y, z) = \lambda\, \omega_1^3(x \wedge y \wedge z) = \lambda\, (x, y, z)$$

That is to say, an arbitrary outer form of degree three is
proportional to the mixed product of its arguments and is uniquely
determined by the numerical factor $\lambda$.


6. We now show that with the outer multiplication of linear forms
there is associated the vector multiplication of vectors, namely, the
following formula holds:

$$\omega_a^1(x) \wedge \omega_b^1(y) = \omega_{[a \times b]}^2(x \wedge y) \qquad (11)$$

where $[a \times b]$ as usual denotes a vector product.

Proof. Using the properties of outer multiplication given in
Subsection 9 of Section 6, we find

$$
\begin{aligned}
\omega_a^1 \wedge \omega_b^1
&= (a_1 \omega_1^1 + a_2 \omega_2^1 + a_3 \omega_3^1) \wedge (b_1 \omega_1^1 + b_2 \omega_2^1 + b_3 \omega_3^1) \\
&= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \omega_2^1 \wedge \omega_3^1
 + \begin{vmatrix} a_3 & a_1 \\ b_3 & b_1 \end{vmatrix} \omega_3^1 \wedge \omega_1^1
 + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \omega_1^1 \wedge \omega_2^1
\end{aligned}
\qquad (12)
$$

Comparing (10) and (12) we get (11).

Note that from the algebraic viewpoint the computation (12)
coincides with the derivation of the formula for a vector product in
terms of the components of the factors, a derivation familiar
from the elementary course of analytical geometry.
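
Formula (11) lends itself to a quick numerical check. In the sketch below (an illustration with assumed sample vectors and hypothetical function names) the left side is evaluated as $(a, x)(b, y) - (a, y)(b, x)$, which is the value of the outer product of two linear forms on the pair $x, y$, and the right side as the mixed product $([a \times b], x, y)$.

```python
def dot(a, b):
    """Scalar product in an orthonormal basis."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Vector product in a right-handed orthonormal basis."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def mixed(a, x, y):
    """Mixed product (a, x, y) = (a, [x x y])."""
    return dot(a, cross(x, y))


def outer_of_linear_forms(a, b, x, y):
    """Value of the outer product of the forms (a, .) and (b, .) on the pair x, y."""
    return dot(a, x) * dot(b, y) - dot(a, y) * dot(b, x)


a, b = (1, 2, 3), (0, 1, -1)   # sample fixed vectors (assumed)
x, y = (1, 0, 2), (3, 1, 1)    # sample arguments (assumed)

print(outer_of_linear_forms(a, b, x, y), mixed(cross(a, b), x, y))  # the two values agree
```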

7. Formula (11) permits another proof that an arbitrary
2-form $\omega^2(x \wedge y)$ in $E_3$ can be represented in the form of a mixed
product $(a, x, y)$. From Theorem 2 of Subsection 10, Section 7,
there are two linear forms, which by Subsection 2 of this section
we can write as $\omega_p^1$ and $\omega_q^1$, such that

$$\omega^2(x \wedge y) = \omega_p^1(x) \wedge \omega_q^1(y) \qquad (13)$$

Putting $a = [p \times q]$, from (4), (11) and (13) we get

$$\omega^2(x \wedge y) = \omega_a^2(x \wedge y) = (a, x, y)$$

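8. Associated with the outer multiplication of a linear form into
a 2-form is a scalar multiplication of vectors, namely:

$$\omega_a^1(x) \wedge \omega_b^2(y \wedge z) = (a, b)\, \omega_1^3(x \wedge y \wedge z) = (a, b)(x, y, z) \qquad (14)$$

Indeed, by (3) and (10) we have the expansions

$$
\begin{aligned}
\omega_a^1 &= a_1 \omega_1^1 + a_2 \omega_2^1 + a_3 \omega_3^1, \\
\omega_b^2 &= b_1\, \omega_2^1 \wedge \omega_3^1 + b_2\, \omega_3^1 \wedge \omega_1^1 + b_3\, \omega_1^1 \wedge \omega_2^1
\end{aligned}
\qquad (15)
$$

Multiplying the expressions (15) termwise and using the
properties of outer multiplication, we get

$$\omega_a^1 \wedge \omega_b^2 = (a_1 b_1 + a_2 b_2 + a_3 b_3)\, (\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1) \qquad (16)$$

As in the case of formula (15), Section 7, it can be shown that the
value of the 3-form $\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1$ on the multivector $x \wedge y \wedge z$ is
equal to a determinant composed of the components of the vectors
$x, y, z$, that is,

$$\omega_1^1 \wedge \omega_2^1 \wedge \omega_3^1 = \omega_1^3 \qquad (17)$$

From (16) and (17) follows (14).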
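
Formula (14) can also be verified on numbers. The sketch below is an illustration under stated assumptions: the outer product of a linear form $u$ with a 2-form $w$ is evaluated as $u(x)w(y, z) - u(y)w(x, z) + u(z)w(x, y)$, which is what the definition of Subsection 8, Section 6 gives for $k = 1$, $l = 2$ when $w$ is skew; the function names and sample vectors are hypothetical.

```python
def dot(a, b):
    """Scalar product in an orthonormal basis."""
    return sum(ai * bi for ai, bi in zip(a, b))


def mixed(a, x, y):
    """Mixed product (a, x, y) as a 3x3 determinant of components."""
    return (
        a[0] * (x[1] * y[2] - x[2] * y[1])
        - a[1] * (x[0] * y[2] - x[2] * y[0])
        + a[2] * (x[0] * y[1] - x[1] * y[0])
    )


def one_form_wedge_two_form(a, b, x, y, z):
    """Value on (x, y, z) of the outer product of (a, .) with the 2-form (b, ., .)."""
    return (
        dot(a, x) * mixed(b, y, z)
        - dot(a, y) * mixed(b, x, z)
        + dot(a, z) * mixed(b, x, y)
    )


a, b = (1, 2, 3), (0, 1, -1)               # sample fixed vectors (assumed)
x, y, z = (1, 0, 2), (3, 1, 1), (0, 1, 1)  # sample arguments (assumed)

print(one_form_wedge_two_form(a, b, x, y, z), dot(a, b) * mixed(x, y, z))  # the two values agree
```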



Chapter XI 


QUADRIC HYPERSURFACES 



§ 1. The general equation of a quadric hypersurface 

1. Given a real $n$-dimensional affine space $\mathfrak{A}_n$ and, in it, a
system of affine coordinates with origin $O$ and basis $e_1, \ldots, e_n$.

A quadric hypersurface in $\mathfrak{A}_n$ is a locus of points $M \in \mathfrak{A}_n$
whose radius vector $x = \overrightarrow{OM}$ satisfies the equation

$$a(x, x) + 2b(x) + c = 0 \qquad (1)$$

where $a(x, x)$ is a quadratic form, $b(x)$ is a linear form, and $c$ is
a constant. The forms $a(x, x)$ and $b(x)$ are assumed to be
invariant under a change of basis.


2. If we put $x = \overrightarrow{OM} = x_1 e_1 + \ldots + x_n e_n$, then (1) can be
given in coordinate notation:

$$\sum a_{ik} x_i x_k + 2 \sum b_i x_i + c = 0 \qquad (2)$$

Here, $x_1, \ldots, x_n$ are the coordinates of the point $M$. They are called
the running coordinates and $M$ is any point. The quadratic form
$a(x, x) = \sum a_{ik} x_i x_k$ is called the group of higher-degree terms of
equation (1) or (2). The linear form

$$2b(x) = 2 \sum b_i x_i$$

is called the group of first-degree terms. The constant $c$ is called
the constant term of the equation.

Remark. Throughout this chapter we will denote coordinates by 
lower indices since tensor algebra will not be used at all. 

3. It may happen that for a certain equation of type (2) there
will not be a single point satisfying it in the real space $\mathfrak{A}_n$. Even
so we will say that such an equation is an equation of a quadric
hypersurface. Sometimes the hypersurface is said to be imaginary
(or zero). For instance, we say that the equation
$x^2 + y^2 + z^2 + 1 = 0$ is the equation of an imaginary sphere (in Euclidean
space with a system of rectangular Cartesian coordinates $x, y, z$).
These words naturally are geometrically meaningless as long as
we remain in real space. However, such unified terminology is
convenient from the formal algebraic viewpoint, since the subject
of the theory to which this chapter is devoted is actually more the
equations themselves than the hypersurfaces. In the theory of these
equations it is best not to lose sight of any cases, firstly, because
it is not clear beforehand whether the equation defines some
nonempty set of points or not; secondly, even when it defines an
empty set, the left-hand member of the equation may have some
kind of mechanical or physical meaning.

4. It can be proved that in a complex affine space, any equation
of type (2) defines a nonempty set of points. However, we will
confine ourselves to real affine space $\mathfrak{A}_n$. Only in a few cases will we
speak of complex points (for instance, if a simultaneous solution
of the equations of a straight line and a hypersurface leads to
complex values of the desired coordinates).

5. Equation (2) with literal coefficients is called the general
equation of a quadric hypersurface. It contains $\frac{1}{2}(n + 1)(n + 2)$
terms. When $n$ is large, this number becomes very great. It is
therefore difficult to investigate directly a hypersurface on the
basis of its equation written in an arbitrary system of coordinates.
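
The count of terms is easy to confirm by enumeration. The sketch below is illustrative only; it assumes that the two symmetric products $a_{ik} x_i x_k$ and $a_{ki} x_k x_i$ are counted as a single term, and the function name `count_terms` is hypothetical.

```python
from itertools import combinations_with_replacement


def count_terms(n):
    """Number of distinct terms of the general equation (2) in n coordinates."""
    # Distinct second-degree monomials x_i * x_k with i <= k.
    quadratic = sum(1 for _ in combinations_with_replacement(range(1, n + 1), 2))
    linear = n      # terms b_i * x_i
    constant = 1    # the term c
    return quadratic + linear + constant


for n in (2, 3, 10):
    print(n, count_terms(n), (n + 1) * (n + 2) // 2)  # enumeration vs. closed formula
```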

Procedures will be given later on that will permit reducing the 
general equation ( 2 ) to certain special forms where the equation 
is incomplete and is called canonical. 

§ 2. Changes in the left member of the equation under translation 
of the origin 

1. Denote by the symbol $2F$ the entire left-hand member of the
equation of a quadric hypersurface, viewing $F$ as a function of the
running coordinates:

$$2F(x_1, \ldots, x_n) = \sum a_{ik} x_i x_k + 2 \sum b_i x_i + c$$

By Section 2, Chapter III, when the origin is translated, the
coordinates vary in accord with the formula $x_i = \tilde{x}_i + x_i^0$, where $x_i$ are
the old coordinates of an arbitrary point, $\tilde{x}_i$ are the new
coordinates of that point, and $x_i^0$ are the coordinates of the new origin
in the old coordinate system.

Substitute these expressions into the function $F$ and regroup the
terms, collecting those to the second and first powers of the new
coordinates. In the process, make use of the symmetrical nature
of the matrix of the quadratic form ($a_{ik} = a_{ki}$):

$$
\begin{aligned}
2F &= \sum a_{ik} (\tilde{x}_i + x_i^0)(\tilde{x}_k + x_k^0) + 2 \sum b_i (\tilde{x}_i + x_i^0) + c \\
&= \sum a_{ik} \tilde{x}_i \tilde{x}_k + 2 \sum_i \Big( \sum_k a_{ik} x_k^0 + b_i \Big) \tilde{x}_i + \Big( \sum a_{ik} x_i^0 x_k^0 + 2 \sum b_i x_i^0 + c \Big)
\end{aligned}
$$

If we write the function $F$ in new coordinates in accordance
with the old standard

$$2F = \sum \tilde{a}_{ik} \tilde{x}_i \tilde{x}_k + 2 \sum \tilde{b}_i \tilde{x}_i + \tilde{c}$$

then

$$\tilde{a}_{ik} = a_{ik}, \qquad \text{(I)}$$

$$\tilde{b}_i = \sum_k a_{ik} x_k^0 + b_i, \qquad \text{(II)}$$

$$\tilde{c} = \sum a_{ik} x_i^0 x_k^0 + 2 \sum b_i x_i^0 + c \qquad \text{(III)}$$
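
Formulas (I) to (III) can be tried on a small example. The sketch below is an illustration, not part of the original text: the coefficients, the new origin and the function names `two_f` and `translate_origin` are assumptions chosen for the check.

```python
def two_f(A, b, c, x):
    """Left-hand member 2F = sum a_ik x_i x_k + 2 sum b_i x_i + c."""
    n = len(x)
    quadratic = sum(A[i][k] * x[i] * x[k] for i in range(n) for k in range(n))
    linear = 2 * sum(b[i] * x[i] for i in range(n))
    return quadratic + linear + c


def translate_origin(A, b, c, x0):
    """New coefficients after moving the origin to x0, by formulas (I)-(III)."""
    n = len(x0)
    A_new = [list(row) for row in A]                                            # (I)
    b_new = [sum(A[i][k] * x0[k] for k in range(n)) + b[i] for i in range(n)]   # (II)
    c_new = two_f(A, b, c, x0)                                                  # (III)
    return A_new, b_new, c_new


A = [[1, 2], [2, -3]]   # symmetric matrix of the quadratic form (assumed sample)
b = [4, -1]
c = 5
x0 = [1, -2]            # new origin in old coordinates (assumed sample)

A_new, b_new, c_new = translate_origin(A, b, c, x0)

x_new = [3, 1]                                    # a point in new coordinates
x_old = [x_new[i] + x0[i] for i in range(2)]      # the same point in old coordinates
print(two_f(A, b, c, x_old), two_f(A_new, b_new, c_new, x_new))  # the two values agree
```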


2. The quantity $\tilde{c}$ is the left-hand member of the original
equation in which the coordinates of the new origin have been
substituted in place of the coordinates of the running point:

$$\tilde{c} = 2F(x_1^0, \ldots, x_n^0)$$

Note that formula (II) may be written differently:

$$\tilde{b}_i = \left( \frac{\partial F}{\partial x_i} \right)_{x = x^0}$$

Here, the partial derivative of $F$ with respect to the argument $x_i$
is computed at the coordinates of the new origin $(x_1^0, \ldots, x_n^0)$.

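3. To make the notation of formulas (I) to (III) more compact,
we introduce the matrices

$$
A = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \ldots & a_{nn} \end{pmatrix},
\qquad
\mathbf{B} = \begin{pmatrix} a_{11} & \ldots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{n1} & \ldots & a_{nn} & b_n \\ b_1 & \ldots & b_n & c \end{pmatrix}
$$

Matrix $\mathbf{B}$ and other matrices of order $n + 1$ will be indicated by
bold type.

Relation (I) in matrix form becomes

$$\tilde{A} = A \qquad \text{(Ia)}$$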

Using an artificial device, it is convenient to write the function $F$
as a quadratic form. To do so, we introduce a supplementary
coordinate $x_{n+1}$, using it as a conventional symbol and assuming that
$x_{n+1} = 1$.
Besides, we assume that $b_i = a_{i,n+1} = a_{n+1,i}$, $c = a_{n+1,n+1}$.
Then the formulas $x_i = \tilde{x}_i + x_i^0$ may be written as

$$
\begin{aligned}
x_1 &= \tilde{x}_1 + x_1^0 \tilde{x}_{n+1}, \\
x_2 &= \tilde{x}_2 + x_2^0 \tilde{x}_{n+1}, \\
&\;\;\vdots \\
x_n &= \tilde{x}_n + x_n^0 \tilde{x}_{n+1}, \\
x_{n+1} &= \tilde{x}_{n+1}
\end{aligned}
\qquad (1)
$$

The last equation shows that $\tilde{x}_{n+1} = 1$, just like $x_{n+1}$.

Because of the introduction of this supplementary coordinate, all
the formulas for transformation of coordinates become
homogeneous (without constant terms). What is more, the left-hand
member of equation (2), Section 1, becomes homogeneous too. In the
new notation we can then write

$$2F = \sum_{i,k=1}^{n} a_{ik} x_i x_k + 2 \sum_{i=1}^{n} b_i x_i + c = \sum_{i,k=1}^{n+1} a_{ik} x_i x_k$$

What we have is a quadratic form with the matrix $\mathbf{B}$.

The matrix of transformation (1) expresses the old coordinates
in terms of new ones and, in accordance with the standard, must
be denoted as $\mathbf{P}^*$:

$$
\mathbf{P}^* = \begin{pmatrix}
1 & 0 & \ldots & 0 & x_1^0 \\
0 & 1 & \ldots & 0 & x_2^0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \ldots & 1 & x_n^0 \\
0 & 0 & \ldots & 0 & 1
\end{pmatrix}
$$

By regarding the function $F$ as a quadratic form of the
arguments $x_1, \ldots, x_{n+1}$, we can apply the familiar formula for
transformation of the matrix of a quadratic form to get the matrix
equation

$$\tilde{\mathbf{B}} = \mathbf{P} \mathbf{B} \mathbf{P}^* \qquad (2)$$

which embraces all the formulas (I)-(III).

The matrix formula (2) permits stating an important theorem. 
Theorem 1. The determinant of matrix $\mathbf{B}$ and its rank are
invariants under a translation of the coordinate origin, that is,

$$\det \tilde{\mathbf{B}} = \det \mathbf{B}, \qquad \operatorname{rank} \tilde{\mathbf{B}} = \operatorname{rank} \mathbf{B}$$

Proof. It is immediately apparent that $\det \mathbf{P}^* = \det \mathbf{P} = 1$, and
so the theorem follows from formula (2).

Remark. When the origin is translated, matrix $A$ is itself an
invariant, which means all its elements are preserved.
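
The matrix formula (2) of this section and Theorem 1 can be illustrated on a small example. The sketch below rests on assumptions: $\mathbf{P}$ is taken to be the transpose of $\mathbf{P}^*$, the sample coefficients are chosen arbitrarily, and the helper names are hypothetical.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def extended_matrix(A, b, c):
    """Matrix B of order n + 1 built from A, b and c."""
    return [list(A[i]) + [b[i]] for i in range(len(b))] + [list(b) + [c]]


def translation_matrix(x0):
    """Matrix P* of the translation: unit matrix with x0 in the last column."""
    n = len(x0)
    P_star = [[1 if i == j else 0 for j in range(n + 1)] for i in range(n + 1)]
    for i in range(n):
        P_star[i][n] = x0[i]
    return P_star


B = extended_matrix([[1, 2], [2, -3]], [4, -1], 5)   # sample coefficients (assumed)
P_star = translation_matrix([1, -2])                 # sample new origin (assumed)
B_new = matmul(matmul(transpose(P_star), B), P_star)

print(B_new)
print(det(B), det(B_new))   # the determinant is unchanged by the translation
```

The last row and column of `B_new` hold the new linear coefficients and the new constant term, so the single matrix product reproduces formulas (I) to (III) at once.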


§ 3. Changes in the left member of the equation for a change 
in the orthonormal basis 


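1. So as not to complicate computations, we henceforth assume
that we are in Euclidean (point) space and make use of
orthonormal bases.

Suppose we are changing from one orthonormal basis to another
one:

$$
\begin{aligned}
e'_1 &= l_{11} e_1 + \ldots + l_{n1} e_n, \\
&\;\;\vdots \\
e'_n &= l_{1n} e_1 + \ldots + l_{nn} e_n
\end{aligned}
$$

The orthogonal matrix $l = \|l_{ij}\|$ is written so that the first index
increases along the row.

The coordinates of the points transform via the formulas

$$
\begin{aligned}
x_1 &= l_{11} x'_1 + \ldots + l_{1n} x'_n, \\
&\;\;\vdots \\
x_n &= l_{n1} x'_1 + \ldots + l_{nn} x'_n
\end{aligned}
\qquad (1)
$$

The matrix

$$l^* = \begin{pmatrix} l_{11} & \ldots & l_{1n} \\ \vdots & & \vdots \\ l_{n1} & \ldots & l_{nn} \end{pmatrix}$$

is also orthogonal.

Formulas (1) are homogeneous (they do not contain constant
terms) since the origin remains fixed.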


2. We now pass to new coordinates in the left member of
equation (2), Section 1, via the formulas (1). We have

$$2F(x_1, \ldots, x_n) = \sum a'_{ik} x'_i x'_k + 2 \sum b'_i x'_i + c'$$

where the primes indicate new coefficients. Due to the homogeneity
of formulas (1), groups of terms of different powers transform
separately. In particular, the constant term remains unaltered:

$$c' = c$$

Similarly,

$$\sum a'_{ik} x'_i x'_k = \sum a_{ik} x_i x_k$$

whence, in matrix notation, we get

$$A' = l A l^*$$

Since matrix $l$ is orthogonal, its transpose is equal to the
inverse, $l^* = l^{-1}$, and so

$$A' = l A l^{-1}$$

As in the case of Subsection 4, Section 2, Chapter VII, from this
matrix equation we get the following theorem.

Theorem 1. When changing from one orthonormal basis to
another, the left member of the equation has as invariants $\det A$,
$\operatorname{rank} A$, and the characteristic polynomial $p(\lambda)$ of matrix $A$.

Remark 1. The polynomial $p(\lambda)$ contains $\det A$ as one of its
coefficients, and therefore the invariance of $\det A$ follows from the
invariance of $p(\lambda)$.

Remark 2. When $p(\lambda)$ is written out in full,

$$p(\lambda) = (-1)^n \{ \lambda^n - p_1 \lambda^{n-1} + p_2 \lambda^{n-2} - \ldots + (-1)^n p_n \}$$

we see that $p_1, p_2, \ldots, p_n$ are invariant; that is, when passing to
a new orthonormal basis, the following are preserved: the sum
of the principal minors of order one of matrix $A$, the sum of the
principal minors of order two, and so on. Thus

$$\operatorname{rank} A' = \operatorname{rank} A, \qquad p'_1 = p_1, \; \ldots, \; p'_n = p_n, \qquad \det A' = \det A$$

3. To find the law of transformation of matrix $\mathbf{B}$ we again
introduce the notation $b_i = a_{i,n+1}$, $c = a_{n+1,n+1}$. Then

$$2F = \sum_{i,k=1}^{n+1} a_{ik} x_i x_k$$

where $x_{n+1} = 1$. We now adjoin another equation to formulas (1)
and abbreviate the resulting formulas to

$$
\left.
\begin{aligned}
&\text{formulas (1)}, \\
&x_{n+1} = x'_{n+1}
\end{aligned}
\right\}
\qquad (2)
$$

The matrix of the transformation formulas (2) is

$$
\mathbf{l}^* = \begin{pmatrix}
l_{11} & \ldots & l_{1n} & 0 \\
\vdots & & \vdots & \vdots \\
l_{n1} & \ldots & l_{nn} & 0 \\
0 & \ldots & 0 & 1
\end{pmatrix}
$$

It is easy to see that

$$\mathbf{l}^* = \mathbf{l}^{-1}$$

Indeed, the last relation of (2) can be inverted in trivial fashion:
it suffices merely to interchange the left and right members. As
to formulas (1), their inversion is associated with taking the
transpose of the matrix $l^*$.

Thus the desired formula for the transformation of matrix $\mathbf{B}$ of
the quadratic form $2F$ is

$$\mathbf{B}' = \mathbf{l} \mathbf{B} \mathbf{l}^* \quad \text{or} \quad \mathbf{B}' = \mathbf{l} \mathbf{B} \mathbf{l}^{-1} \qquad (3)$$


From (3) we get 

Theorem 2. The determinant and rank of matrix $\mathbf{B}$ are invariant
under a change from one orthonormal basis to another:

$$\det \mathbf{B}' = \det \mathbf{B}, \qquad \operatorname{rank} \mathbf{B}' = \operatorname{rank} \mathbf{B}$$

4. From (Ia) of Section 2, from Theorem 1 of Section 2, and
from Theorems 1 and 2 of Section 3 follows the

Corollary. Under a general transformation of coordinates, which
consists in the translation of the origin and the change from an
old orthonormal basis to a new orthonormal basis, we have the
invariants

$$\det \mathbf{B}, \quad \operatorname{rank} \mathbf{B}, \quad \det A, \quad \operatorname{rank} A$$

and the characteristic polynomial $p(\lambda)$ of matrix $A$.
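
The invariants under a change of orthonormal basis can be observed on a small exact example. The sketch below is an illustration under assumptions: the plane rotation with rational entries 3/5 and 4/5 plays the role of the matrix of formulas (1), the transformed matrices are computed as transpose times matrix times the transformation matrix, and all names and sample coefficients are hypothetical.

```python
from fractions import Fraction


def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def extend_orthogonal(l_star):
    """Matrix of order n + 1: l_star bordered by zeros, with 1 in the corner."""
    n = len(l_star)
    return [list(row) + [0] for row in l_star] + [[0] * n + [1]]


# Orthogonal matrix with rational entries (a plane rotation), assumed sample.
l_star = [[Fraction(3, 5), Fraction(-4, 5)], [Fraction(4, 5), Fraction(3, 5)]]

A = [[1, 2], [2, -3]]                        # sample quadratic part (assumed)
B = [[1, 2, 4], [2, -3, -1], [4, -1, 5]]     # sample extended matrix (assumed)

A_new = matmul(matmul(transpose(l_star), A), l_star)
L_star = extend_orthogonal(l_star)
B_new = matmul(matmul(transpose(L_star), B), L_star)

print(A_new[0][0] + A_new[1][1], det(A_new))   # p_1 and p_2 = det A are preserved
print(det(B_new), B_new[2][2])                 # det B and the constant term are preserved
```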

§ 4. The centre of a quadric hypersurface 

1. Ordinarily, the centre of a quadric hypersurface is taken to
be that point of the space $\mathfrak{A}_n$ relative to which all points of the
hypersurface are arranged in symmetric pairs. Thus, when we
speak of the centre, we have in mind the centre of symmetry
(Fig. 77).

[Fig. 77]

Unfortunately, in real space this definition becomes invalid in
cases where there is not a single point that can satisfy the
equation of the hypersurface. However, in those cases as well there
may be points which for algebraic reasons it is advisable to
consider as centres. For example, the centre of the imaginary sphere
$x^2 + y^2 + z^2 + 1 = 0$ is the origin of the coordinate system. For
this reason, we prefer to define the concept of the centre of a
quadric hypersurface differently.

2. Consider the incomplete equation

$$ \sum a_{ik} x_i x_k + c = 0 \tag{1} $$

If a point $(x_1, \dots, x_n)$ lies on the hypersurface (1), then the
point symmetric to it, $(-x_1, \dots, -x_n)$, also lies on the
hypersurface (1).

Hence if there are points satisfying equation (1), then the origin
is the centre of symmetry of the hypersurface (1).

On this basis we give the following formally algebraic definition
of a centre.

The centre of an arbitrary quadric hypersurface is a point such
that if the coordinate origin is placed at that point, then the
equation of the hypersurface takes on the incomplete form of (1).
Thus, we say that the centre is any point relative to which the left
member of the equation possesses central symmetry (remains unaltered
when $x_1, \dots, x_n$ is replaced by $-x_1, \dots, -x_n$).

3. Given a general equation of the second degree:

$$ \sum a_{ik} x_i x_k + 2 \sum b_i x_i + c = 0 $$

We want to determine whether there is a centre and if so to find
it. We carry the origin to the point $(x_1^0, \dots, x_n^0)$ and
obtain the equation

$$ \sum a_{ik} x'_i x'_k + 2 \sum b'_i x'_i + c' = 0 $$

The new origin will be the centre if and only if

$$ b'_i = 0, \qquad i = 1, \dots, n \tag{2} $$

Equations (2), with account taken of equations (II) of Section 2,
yield the so-called equations of the centre (equations defining the
centre):

$$ \sum_{k} a_{ik} x_k^0 + b_i = 0 $$

Written out in full, the equations of the centre look like this:

$$
\left.
\begin{aligned}
a_{11} x_1^0 + \dots + a_{1n} x_n^0 &= -b_1 \\
\dots\dots\dots\dots\dots\dots\dots & \\
a_{n1} x_1^0 + \dots + a_{nn} x_n^0 &= -b_n
\end{aligned}
\right\} \tag{3}
$$

The matrix of system (3) coincides with matrix $A$.






If $\det A \ne 0$, then (3) has a unique solution. Then the
hypersurface has a unique centre. Such a hypersurface is called a
central hypersurface.

If $\det A = 0$, then (3) is either inconsistent, in which case there
is no centre (as, say, in the case of a parabola) or it is consistent,
and then there are infinitely many centres (as in the case of a
circular cylinder or a pair of parallel planes).
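The three possibilities for system (3) can be told apart by comparing
the rank of $A$ with the rank of the augmented matrix. The following
is a minimal sketch under stated assumptions: the helper names `rank`
and `centre_type` are hypothetical, and exact rational arithmetic is
used so that the rank test is not disturbed by rounding.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def centre_type(A, b):
    """Classify the equations of the centre A x0 = -b."""
    n = len(A)
    augmented = [list(A[i]) + [-b[i]] for i in range(n)]
    r_a, r_aug = rank(A), rank(augmented)
    if r_a == n:
        return "unique centre"            # det A != 0
    if r_a < r_aug:
        return "no centre"                # system (3) is inconsistent
    return "infinitely many centres"      # consistent, det A = 0

# x^2 + 4y^2 - 2x + 8y + 1 = 0 (an ellipse)
print(centre_type([[1, 0], [0, 4]], [-1, 4]))
# y^2 - 2x = 0 (a parabola)
print(centre_type([[0, 0], [0, 1]], [-1, 0]))
# x^2 - 1 = 0 (a pair of parallel lines)
print(centre_type([[1, 0], [0, 0]], [0, 0]))
```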

4. The very definition of a centre suggests the first step towards
a simplification of the equation: it is necessary to carry the origin
to the centre.

5. We introduce two symbols:

$$ \delta = \det A, \qquad \Delta = \det B $$

The criterion of a central surface is the inequality $\delta \ne 0$.

§ 5. Reducing to canonical form the general equation of a quadric 
hypersurface in Euclidean space 

1. Quadric hypersurfaces are divided into several classes for 
which we obtain distinct elementary, or canonical, forms of the 
equations. 

2. In the theoretical exposition we will not strive to save on
operations and will begin with a rotation of the coordinate system.
In practical computations, if the surface is central, then as a first
step it is best to carry the origin to the centre.

3. We will consider a hypersurface in Euclidean space and will
make use solely of orthonormal bases. Suppose we have the general
equation $2F(x_1, \dots, x_n) = 0$. First consider the self-adjoint
transformation with matrix $A$, which is the matrix of a quadratic
form (the group of higher-degree terms). All roots
$\lambda_1, \dots, \lambda_n$ of the characteristic polynomial of this
transformation are real and there exists an orthonormal basis made up
of the eigenvectors $e'_1, \dots, e'_n$, and to the eigenvector $e'_k$
corresponds the eigenvalue $\lambda_k$, $k = 1, \dots, n$ (see
Chapter IX, Section 3).

We now pass to this basis while retaining the earlier origin.
Then the group of higher-degree terms assumes the canonical
form

$$ \sum a'_{ik} x'_i x'_k = \lambda_1 (x'_1)^2 + \dots + \lambda_n (x'_n)^2 $$

and the left member of the equation of the hypersurface is
simplified to

$$ 2F = \lambda_1 (x'_1)^2 + \dots + \lambda_n (x'_n)^2 + 2 b'_1 x'_1 + \dots + 2 b'_n x'_n + c $$

The coefficients of the first-degree terms have changed and so they
are primed. The constant term $c$ remains unaltered.


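4. We now consider several specific cases.

(1) All the characteristic roots are different from zero:

$$ \lambda_1 \ne 0, \ \dots, \ \lambda_n \ne 0 $$

It is then necessary to isolate perfect squares:

$$ \lambda_k (x'_k)^2 + 2 b'_k x'_k = \lambda_k \left( x'_k + \frac{b'_k}{\lambda_k} \right)^2 - \frac{(b'_k)^2}{\lambda_k}, \qquad k = 1, \dots, n $$

Then translate the origin via the formulas

$$ x'_k = \tilde{x}_k - \frac{b'_k}{\lambda_k}, \qquad k = 1, \dots, n $$

There will be no first-degree terms and so the new origin is the
centre of the surface. We obtain

$$ \lambda_1 \tilde{x}_1^2 + \dots + \lambda_n \tilde{x}_n^2 = H \tag{I} $$

where

$$ H = \frac{(b'_1)^2}{\lambda_1} + \dots + \frac{(b'_n)^2}{\lambda_n} - c $$

Equation (I) is canonical.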
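As an illustration of case (1), here is a short worked example in the
plane. The particular equation is an assumption chosen so that the
matrix $A$ is already diagonal and no rotation is needed; the new
coordinates are written with a tilde.

```latex
% Sample equation: \lambda_1 = 1, \lambda_2 = 4, b_1 = -1, b_2 = 4, c = 1
2F = x_1^2 + 4x_2^2 - 2x_1 + 8x_2 + 1 = 0
% Isolate perfect squares
(x_1 - 1)^2 + 4(x_2 + 1)^2 - 1 - 4 + 1 = 0
% Translate the origin: x_1 = \tilde{x}_1 + 1, \quad x_2 = \tilde{x}_2 - 1
\tilde{x}_1^2 + 4\tilde{x}_2^2 = 4
% The right member agrees with b_1^2/\lambda_1 + b_2^2/\lambda_2 - c:
H = \frac{(-1)^2}{1} + \frac{4^2}{4} - 1 = 4
```

The new origin $(1, -1)$ is the centre, and the canonical equation
describes an ellipse with semiaxes $2$ and $1$.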

(2) $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$,
$\lambda_{r+1} = \dots = \lambda_n = 0$; $r$ is the rank of
the quadratic form of the higher-degree terms; $r \le n - 1$.

Here the situation becomes somewhat complicated and in order
to avoid cumbersome computations it is necessary to carry out the
transformation of the form of the higher-degree terms with an
aptly chosen plan of action.

Let us first of all find the eigenvectors corresponding to
$\lambda_1, \dots, \lambda_r$. As we know, they may be chosen so as to
form an orthonormal system $e'_1, \dots, e'_r$ (see Section 3,
Chapter IX). The other basis vectors will be determined later.

Suppose we have the original equation in the initial coordinates:

$$ \sum a_{ik} x_i x_k + 2 \sum b_i x_i + c = 0 $$

Its linear part is uniquely defined by specifying the vector
$b = \{b_1, b_2, \dots, b_n\}$.

Decompose $b$ into two components, one of which lies in the linear
hull of $e'_1, \dots, e'_r$, the other being orthogonal to the
indicated linear hull:

$$ b = \beta_1 e'_1 + \dots + \beta_r e'_r - p $$

To do this, set

$$ \beta_1 = (b, e'_1), \ \dots, \ \beta_r = (b, e'_r), \qquad p = -b + \sum_{k=1}^{r} \beta_k e'_k $$

The vector $p$ thus constructed is orthogonal to the subspace
$L(e'_1, \dots, e'_r)$.

If $p \ne 0$, then we send $e'_n$ along the vector $p$, and

$$ p = \mu e'_n $$

where $\mu$ is a numerical coefficient.

The vector $e'_n$ is taken to be a unit vector. It can be directed so
that we have $\mu = |p|$ or so that $\mu = -|p|$.

Take the vectors $e'_{r+1}, \dots, e'_{n-1}$ so that together with the
earlier constructed vectors they form an orthonormal system:
$e'_1, e'_2, \dots, e'_r, e'_{r+1}, \dots, e'_{n-1}, e'_n$. Otherwise
the choice of the vectors $e'_{r+1}, \dots, e'_{n-1}$ is arbitrary.

If $p = 0$, then $e'_{r+1}, \dots, e'_n$ can be taken at pleasure as
long as the system $e'_1, \dots, e'_n$ is orthonormal. Note that in
this case as well the equation

$$ p = \mu e'_n $$

holds true, but $\mu = |p| = 0$.

Thus

$$ b = \beta_1 e'_1 + \dots + \beta_r e'_r - \mu e'_n \tag{1} $$

The groups of second-degree, first-degree and zero-degree terms
transform separately under a transition to a new basis.

The higher-degree terms in the new basis take the form

$$ \sum a_{ik} x_i x_k = \lambda_1 (x'_1)^2 + \dots + \lambda_r (x'_r)^2 $$

It is natural to write the group of first-degree terms as a scalar
product:

$$ \sum b_k x_k = (b, x) $$

Due to (1)

$$ (b, x) = (\beta_1 e'_1 + \dots + \beta_r e'_r - \mu e'_n, \ x) = \beta_1 x'_1 + \dots + \beta_r x'_r - \mu x'_n $$

since in an orthonormal basis the scalar product of any vector by
a basis vector is equal to the corresponding component (coordinate):
$(e'_k, x) = x'_k$. Thus, after changing to a new basis, we have

$$ 2F = \lambda_1 (x'_1)^2 + \dots + \lambda_r (x'_r)^2 + 2\beta_1 x'_1 + \dots + 2\beta_r x'_r - 2\mu x'_n + c = 0 $$

The constant term $c$ remains unchanged.
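We now isolate perfect squares for $k = 1, \dots, r$:

$$ \lambda_k (x'_k)^2 + 2\beta_k x'_k = \lambda_k \left( x'_k + \frac{\beta_k}{\lambda_k} \right)^2 - \frac{\beta_k^2}{\lambda_k} $$

Then we translate the origin (only in the direction of the
coordinate axes labelled $1, \dots, r$):

$$
\begin{aligned}
x'_1 &= \tilde{x}_1 - \frac{\beta_1}{\lambda_1}, & x'_{r+1} &= \tilde{x}_{r+1}, \\
&\ \ \vdots & &\ \ \vdots \\
x'_r &= \tilde{x}_r - \frac{\beta_r}{\lambda_r}, & x'_n &= \tilde{x}_n
\end{aligned}
$$

The equation then becomes

$$ \lambda_1 \tilde{x}_1^2 + \dots + \lambda_r \tilde{x}_r^2 - 2\mu \tilde{x}_n = H $$

where $H = \beta_1^2 / \lambda_1 + \dots + \beta_r^2 / \lambda_r - c$.
If $\mu = 0$, we get the equation

$$ \lambda_1 \tilde{x}_1^2 + \dots + \lambda_r \tilde{x}_r^2 = H \tag{$\alpha$} $$

which is canonical. If $\mu \ne 0$, then we write

$$ 2\mu \tilde{x}_n + H = 2\mu \left( \tilde{x}_n + \frac{H}{2\mu} \right) $$

Again translate the origin, this time along the axis $\tilde{x}_n$ by
the amount $-\dfrac{H}{2\mu}$. The labels of the coordinates remain
unchanged so as not to complicate notation.

The equation becomes

$$ \lambda_1 \tilde{x}_1^2 + \dots + \lambda_r \tilde{x}_r^2 - 2\mu \tilde{x}_n = 0 \tag{$\beta$} $$

This equation is canonical too.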
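A short worked example of case (2) in the plane may help. The sample
equation below is an assumption chosen so that the eigenvectors
coincide with the original basis vectors; new coordinates are written
with a tilde.

```latex
% Sample equation: \lambda_1 = 1, \lambda_2 = 0, r = 1, b = \{1, -2\}, c = 5
2F = x_1^2 + 2x_1 - 4x_2 + 5 = 0
% Decomposition of b with e'_1 = \{1, 0\}:
\beta_1 = (b, e'_1) = 1, \qquad p = -b + \beta_1 e'_1 = \{0, 2\}, \qquad
\mu = |p| = 2, \qquad e'_2 = \{0, 1\}
% Isolate the perfect square and shift along the first axis
(x_1 + 1)^2 - 4x_2 = -4, \qquad H = \frac{\beta_1^2}{\lambda_1} - c = 1 - 5 = -4
% Shift along the second axis by -H/(2\mu) = 1
\tilde{x}_1^2 - 4\tilde{x}_2 = 0, \qquad \tilde{x}_1 = x_1 + 1, \quad \tilde{x}_2 = x_2 - 1
```

The result is a parabola with $\mu = 2$, which can be confirmed by
expanding $(x_1 + 1)^2 = 4(x_2 - 1)$.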

No other cases, except those considered above, are possible. It 
now remains to list and classify the quadric hypersurfaces. 

§ 6. Classification of quadric hypersurfaces in Euclidean space 

1. On the basis of the foregoing simplification of equations,
hypersurfaces naturally fall into the following classes:

(I) $\delta = \det A \ne 0$. This means that not a single $\lambda_i$
is equal to zero. This class includes all cases labelled (I), and only
these cases. We have the canonical equation

$$ \lambda_1 x_1^2 + \dots + \lambda_n x_n^2 = H \tag{I} $$

Here and henceforth we write the running coordinates without any
additional labels.

(II) $\delta = \det A = 0$, $\mu \ne 0$, $r = \operatorname{rank} A = n - 1$.
We have the corresponding canonical equation

$$ \lambda_1 x_1^2 + \dots + \lambda_{n-1} x_{n-1}^2 - 2\mu x_n = 0 \tag{II} $$

(it is obtained from ($\beta$), Section 5, for $r = n - 1$).

(I') $\delta = 0$, $\mu = 0$. (Since $\delta = 0$, it follows that
$r < n$.) This class includes surfaces with canonical equations of
the type ($\alpha$) of Section 5, or

$$ \lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = H \tag{I'} $$

where $1 \le r \le n - 1$.

(II') $\delta = 0$, $\mu \ne 0$, $r < n - 1$. This class includes
surfaces with canonical equations of type ($\beta$), Section 5, for
$r < n - 1$, that is,

$$ \lambda_1 x_1^2 + \dots + \lambda_r x_r^2 - 2\mu x_n = 0 \tag{II'} $$

Here $1 \le r \le n - 2$.

The foregoing classes exhaust all possibilities. Cases where the
equation is of the form (I) or (II) are basic. Cases (I') and (II')
repeat the basic cases, with the sole difference of being in a
subspace of smaller dimension.

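2. Let us write down the matrices $A$ and $B$ for the basic cases.

Case I.

$$
A = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},
\qquad
B = \begin{pmatrix} \lambda_1 & & & \\ & \ddots & & \\ & & \lambda_n & \\ & & & -H \end{pmatrix}
$$

Case II.

$$
A = \begin{pmatrix} \lambda_1 & & & \\ & \ddots & & \\ & & \lambda_{n-1} & \\ & & & 0 \end{pmatrix},
\qquad
B = \begin{pmatrix} \lambda_1 & & & & \\ & \ddots & & & \\ & & \lambda_{n-1} & & \\ & & & 0 & -\mu \\ & & & -\mu & 0 \end{pmatrix}
$$

Definition. A quadric hypersurface is said to be nondegenerate
if the matrix $B$ is nonsingular, that is, if

$$ \Delta = \det B \ne 0 $$

It is clear that the surfaces (I) are nondegenerate provided that
$H \ne 0$, and all surfaces (II) as well.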
3. According to Subsection 4, Section 3, the quantities
$\delta = \det A$, $\Delta = \det B$, $r = \operatorname{rank} A$,
$\operatorname{rank} B$, and the characteristic polynomial
$p(\lambda)$ of matrix $A$ are invariants of the left member of the
equation in the class of orthonormal coordinate systems. All these
quantities can be found from the left-hand member of the general
equation of a quadric hypersurface specified in any orthonormal
coordinates. Besides, we know the equations of the centre in any
coordinates.

Therefore, without passing to the canonical equation we can
determine whether a hypersurface is central or not, whether it is
degenerate or not, and we can find the set of all centres and
compute all roots $\lambda_i$ of the characteristic polynomial of
matrix $A$.

Besides, we can determine $H$ for a hypersurface of type (I).
Indeed, from Subsection 2 we have

$$ -H \lambda_1 \dots \lambda_n = \Delta $$

Here $\lambda_1 \dots \lambda_n = \delta \ne 0$, whence

$$ H = -\frac{\Delta}{\delta} $$

For a hypersurface of type (II) we have

$$ -\mu^2 \lambda_1 \dots \lambda_{n-1} = \Delta $$

But in the case $\lambda_n = 0$ the product
$\lambda_1 \dots \lambda_{n-1}$ taken with the minus sign is equal to
the coefficient $p_{n-1}$ of the characteristic polynomial
$p(\lambda)$. Here it is necessary to bear in mind that the
characteristic polynomial is written as indicated in Remark 2 of
Subsection 2, Section 3. From this we find

$$ \mu = \pm \sqrt{\frac{\Delta}{p_{n-1}}} \tag{1} $$

The radicand in (1) is positive since the existence and reality of
$\mu$ have been established in the preceding investigations.
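For curves in the plane ($n = 2$) these relations can be turned into a
small computation. The sketch below is an illustration only: the
function name is hypothetical, and for the type (II) case it uses the
fact that when $\delta = 0$ and $n = 2$ the single nonzero
characteristic root equals the trace of $A$, so that
$\mu^2 = -\Delta / \lambda_1$. The choice $\mu > 0$ is an assumption;
the opposite sign corresponds to reversing the last basis vector.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def canonical_constant(a11, a12, a22, b1, b2, c):
    """Return ("H", value) for type (I) or ("mu", value) for type (II),
    computed from the invariants without reducing the equation."""
    big_delta = det3([[a11, a12, b1], [a12, a22, b2], [b1, b2, c]])
    delta = a11 * a22 - a12 ** 2
    if delta != 0:
        return ("H", -big_delta / delta)
    lam = a11 + a22            # the only nonzero root when n = 2, delta = 0
    if big_delta != 0:
        return ("mu", math.sqrt(-big_delta / lam))
    return ("degenerate, not of type (I) or (II)", None)

# x^2 + 4y^2 - 2x + 8y + 1 = 0, canonical form x^2 + 4y^2 = 4
print(canonical_constant(1, 0, 4, -1, 4, 1))
# x^2 + 2x - 4y + 5 = 0, canonical form x^2 - 4y = 0
print(canonical_constant(1, 0, 0, 1, -2, 5))
```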

4. For nondegenerate hypersurfaces of Case I we have

$$ \operatorname{rank} A = n, \qquad \operatorname{rank} B = n + 1 \tag{2} $$

From the first equality of (2) and from the Kronecker-Capelli
theorem applied to the system of equations of the centre we conclude
that all of them are central.

Let us consider them in more detail. 


5. If $\lambda_1, \dots, \lambda_n$ and $H$ are numbers of one sign,
then the hypersurface (I) is said to be an $(n-1)$-dimensional
ellipsoid. Its equation can be rewritten thus:

$$ \frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \dots + \frac{x_n^2}{a_n^2} = 1 \tag{3} $$

The quantities $a_i$ are called the semiaxes of the ellipsoid
($a_i > 0$). It is easy to verify that the ellipsoid is located in a
parallelepiped defined by the inequalities $|x_i| \le a_i$,
$i = 1, \dots, n$. (For $n = 3$ see Fig. 78, for $n = 2$ see Fig. 79.)
Note that a $k$-dimensional ellipsoid for $k = 1$ is an ellipse
(Fig. 79); for $k = 0$ it is a pair of points $x_1 = \pm a_1$
(Fig. 80).

It is difficult to give a pictorial representation of ellipsoid (3)
for $n > 3$. However, a comparison of Figs. 78, 79 and 80 will help
the reader to picture to himself how a $k$-dimensional ellipsoid
becomes more and more complicated as the dimension $k$ increases.

Fig. 80

If $a_1 = \dots = a_n = R$, the ellipsoid (3) is called an
$(n-1)$-dimensional sphere of radius $R$.

6. If $\lambda_1, \dots, \lambda_n$ are of one sign and $H$ is of
another, the surface (I) is termed an imaginary ellipsoid. It is
without points in real space.


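7. If $\lambda_1, \dots, \lambda_n$ are of different signs and
$H \ne 0$, then surface (I) is called a hyperboloid. In this case it
can be reduced to the form

$$ \frac{x_1^2}{a_1^2} + \dots + \frac{x_k^2}{a_k^2} - \frac{x_{k+1}^2}{b_1^2} - \dots - \frac{x_n^2}{b_{n-k}^2} = 1 \tag{4} $$

by dividing both members of (I) by $H$.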
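The sign rules of Subsections 5 to 7 can be collected in a few lines
of code. This is a sketch under stated assumptions: the function name
is hypothetical, the input is taken to be the canonical equation (I)
with all $\lambda_i \ne 0$ and $H \ne 0$, and $k$ counts the terms
that become positive after division by $H$, as in (4).

```python
def classify_central(lams, H):
    """Name the hypersurface sum(lam_i * x_i**2) = H (all lam_i != 0, H != 0)."""
    if H == 0 or any(lam == 0 for lam in lams):
        raise ValueError("expects a nondegenerate central hypersurface")
    n = len(lams)
    # number of terms that are positive after dividing both members by H
    k = sum(1 for lam in lams if lam * H > 0)
    if k == n:
        return "ellipsoid"
    if k == 0:
        return "imaginary ellipsoid"
    return f"hyperboloid with k = {k}"

print(classify_central([1, 4], 4))          # x^2 + 4y^2 = 4
print(classify_central([1, 1, 1], -1))      # x^2 + y^2 + z^2 = -1
print(classify_central([1, -1, -1], 1))     # two-sheeted case for n = 3
print(classify_central([1, -1, -1], -2))    # one-sheeted case for n = 3
```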

The quantities $a_1, \dots, a_k$ are called the semitransverse axes
and $b_1, \dots, b_{n-k}$ are called the semiconjugate axes of the
hyperboloid (4) ($a_i > 0$, $b_j > 0$). Depending on the signature of
the left member of equation (4), hyperboloids have different
geometric structure. In elementary analytic geometry, the forms of
surfaces are investigated by considering their sections by different
planes (see, for instance, Figs. 82, 83). We will now apply the same
procedure to obtain a picture of the structure of different
hyperboloids. We consider special cases beginning with familiar
objects in low-dimensional spaces:

(1) For $n = 2$, equation (4) takes the form
$\dfrac{x_1^2}{a_1^2} - \dfrac{x_2^2}{b_1^2} = 1$ and specifies a
hyperbola (Fig. 81).

Fig. 81
(2) For $n = 3$ there are two possibilities: a hyperboloid of two
sheets (Fig. 82):

$$ \frac{x_1^2}{a_1^2} - \frac{x_2^2}{b_1^2} - \frac{x_3^2}{b_2^2} = 1 \tag{5} $$

and a hyperboloid of one sheet (Fig. 83):

$$ \frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} - \frac{x_3^2}{b_1^2} = 1 $$

(3) $n = 4$. Here we have to consider three cases:

(a) The hyperboloid of two sheets

$$ \frac{x_1^2}{a_1^2} - \frac{x_2^2}{b_1^2} - \frac{x_3^2}{b_2^2} - \frac{x_4^2}{b_3^2} = 1 \tag{6} $$

like the hyperbola and the hyperboloid (5), consists of two separate
parts located in the half-spaces $x_1 \ge a_1$ and $x_1 \le -a_1$.
With hyperplanes of the form $x_1 = \text{constant}$, for
$|x_1| > a_1$, it intersects in ellipsoids whose semiaxes increase
with increasing $|x_1|$. With the remaining hyperplanes of the form
$x_i = \text{constant}$ ($i = 2, 3, 4$) the hyperboloid (6) intersects
in two-sheeted hyperboloids of type (5).

(b) The equation

$$ \frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} - \frac{x_3^2}{b_1^2} - \frac{x_4^2}{b_2^2} = 1 $$

specifies a new type of hyperboloid with the feature that with every
one of the hyperplanes of type $x_i = \text{constant}$ it intersects
either along a certain hyperboloid (one-sheeted or two-sheeted) or
along a cone. (The concept of a cone was introduced in Section 12,
Chapter IV; quadric cones are discussed in more detail in
Subsection 9 below.) One must bear in mind that the typical section
here is the hyperboloid. Cones only appear in separate hyperplanes
($|x_1| = a_1$, $|x_2| = a_2$); they may be viewed as degenerate
hyperboloids. Three-dimensional space does not have an analogous
surface, but for higher-dimensional spaces this is the most typical
case.

(c) A hyperboloid similar to the one-sheeted hyperboloid:

$$ \frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \frac{x_3^2}{a_3^2} - \frac{x_4^2}{b_1^2} = 1 $$

With all hyperplanes $x_4 = \text{constant}$ it intersects along
two-dimensional ellipsoids whose semiaxes increase with increasing
$|x_4|$; with the remaining hyperplanes $x_i = \text{constant}$ it
intersects in hyperboloids (one-sheeted or two-sheeted) that
degenerate into cones if $x_i = \pm a_i$.

It is difficult to give a drawing of hyperboloids in
four-dimensional space. However, they may be imagined by analogy with
the lower-dimensional cases shown in Figs. 81 to 83. Bear in mind
that sections by hyperplanes have the form of the surfaces depicted
in Figs. 78, 82, 83, and 25 (also see Figs. 77 and 88).

In the general case we have the following possibilities:

(a) $n \ge 2$, $k = 1$ is a two-sheeted hyperboloid consisting of two
parts located in the half-spaces $x_1 \ge a_1$ and $x_1 \le -a_1$.
The hyperplanes $x_1 = \text{constant}$ ($|x_1| > a_1$) intersect it
in $(n-2)$-dimensional ellipsoids, the remaining hyperplanes
$x_i = \text{constant}$ ($i = 2, \dots, n$) along two-sheeted
hyperboloids.

(b) $n \ge 4$, the number of positive and also the number of
negative terms in the left member of (4) is at least two. Such a
hyperboloid intersects each of the hyperplanes
$x_i = \text{constant}$ along some hyperboloid of smaller dimension
(even one degenerating into a cone).

(c) $n \ge 3$, $k = n - 1$ is a hyperboloid similar to the
one-sheeted hyperboloid. All hyperplanes $x_n = \text{constant}$
intersect it along $(n-2)$-dimensional ellipsoids, the remaining
hyperplanes $x_i = \text{constant}$ ($i = 1, \dots, n-1$), along
hyperboloids or cones.

It is easy to prove (say, by induction on the dimension of the
space) that all hyperboloids, except two-sheeted ones, contain
rectilinear generators.

Also note that every $k$-dimensional plane of the type
$x_{k+1} = c_{k+1}, \dots, x_n = c_n$ ($c_i = \text{constant}$)
intersects hyperboloid (4) in a $(k-1)$-dimensional ellipsoid, and
among the $(n-k)$-dimensional planes of the type
$x_1 = c_1, \dots, x_k = c_k$ ($c_i = \text{constant}$) there are such
that intersect hyperboloid (4) in $(n-k-1)$-dimensional ellipsoids.
It can be proved that on the hyperboloid (4) there are
$r$-dimensional ellipsoids of all possible dimensions
$r \le \max\{k-1, \ n-k-1\}$ and there are no ellipsoids of higher
dimension. For $n = 2, 3$ this is evident from a comparison of
Figs. 79 to 83. We will not consider the general case.

If in addition to the given Euclidean metric we introduce into
the space another quadratic metric with an alternating quadratic
form, then the role of spheres there will be played by hyperboloids.
In this connection, the two-sheeted hyperboloid in four-dimensional
space plays an important part in the theory of relativity.

8. All surfaces with canonical equations of type (II) are
nondegenerate and are called paraboloids.

Every one of the paraboloids is devoid of any centre. By the
Kronecker-Capelli theorem the system of equations of a centre in
the case of paraboloids is inconsistent, since the rank of the basic
matrix $A$ is equal to $n - 1$ and the rank of the augmented matrix
is equal to $n$ (see matrices $A$ and $B$ which are described in
detail in Subsection 2).

There are many different types of paraboloids due to the different
combinations of signs of $\lambda_i$. They can be investigated as is
done in the preceding subsection.

9. We now consider degenerate surfaces, that is, those for
which $\Delta = \det B = 0$.

They are conveniently classified into three groups.

(1) Case (I), provided that $H = 0$. Then
$\operatorname{rank} A = \operatorname{rank} B = n$ and the
hypersurface is central. The equation is of the form

$$ \lambda_1 x_1^2 + \dots + \lambda_n x_n^2 = 0 \tag{7} $$

We know that a homogeneous equation of the second degree
specifies a cone (see Section 12, Chapter IV). If all $\lambda_i$ are
of the same sign, then the cone is imaginary. (Bear in mind that it
has one real point, and that is the centre.)

If the $\lambda_i$ are of different signs, the cone is called a real
cone in the sense that it has real points besides the centre. By
renumbering the coordinates and changing the sign of the left member
of equation (7) it is possible to achieve $\lambda_1 > 0$,
$\lambda_n < 0$ and see that the number of negative terms does not
exceed the number of positive ones. Then (7) reduces to

$$ \frac{x_1^2}{a_1^2} + \dots + \frac{x_k^2}{a_k^2} - \frac{x_{k+1}^2}{b_1^2} - \dots - \frac{x_n^2}{b_{n-k}^2} = 0 \tag{8} $$

where $k \ge n/2$. The hyperplane $x_n = \text{constant} \ne 0$
intersects cone (8) in an $(n-2)$-dimensional ellipsoid if
$k = n - 1$, or in a hyperboloid if $k < n - 1$ (this latter case is
only possible for $n \ge 4$). It is easy to verify that the cone (8)
consists of all possible straight lines passing through the origin
and through points of the surface in which it intersects the
hyperplane $x_n = \text{constant} \ne 0$.

(2) Consider equation (I'). Let $E^*$ be a subspace of dimension $r$
spanned by the basis vectors $e_1, \dots, e_r$. In this subspace,
equation (I') defines a hypersurface of type (I), which we denote
by $S$.

In the case of any dimension, equation (I') defines a hypersurface
called a cylinder. Here is what we have in detail.

Denote by $E^{**}$ the orthogonal complement of subspace $E^*$
($E^{**} = L(e_{r+1}, \dots, e_n)$). Take an arbitrary vector $a$ in
$E^{**}$ and translate the hypersurface (I') by the vector $a$. Then
in the case of each point only those coordinates will change that do
not appear in (I'), that is, $x_{r+1}, \dots, x_n$. Therefore the
points obtained in any such displacement also satisfy equation (I').
Thus, the hypersurface (I') is obtained by a parallel transfer of $S$
in all possible directions in subspace $E^{**}$. This construction is
a multidimensional generalization of how the cylinder
$x^2 + y^2 = 1$ in $E_3$ is obtained via a parallel displacement of a
circle along the $z$-axis ($x_3$) (Fig. 84). The role of rectilinear
generators on cylinder (I') is played by $(n-r)$-dimensional planes
parallel to $E^{**}$.

All the cylinders (I') have infinitely many centres. It is easy to
verify that the collection of all centres coincides with the
subspace $E^{**}$.

Fig. 84

(3) Equations of type (II') determine surfaces called parabolic
cylinders. Denote by $E'$ the $(r+1)$-dimensional subspace spanned
by the vectors $e_1, \dots, e_r, e_n$ and its orthogonal complement by
$E''$. In the subspace $E'$ the equation (II') determines a
paraboloid, which we denote by $S$, as before. The hypersurface (II')
is formed by a parallel displacement of the paraboloid $S$ by all
possible vectors of the subspace $E''$. Parabolic cylinders do not
have centres.

§ 7. Affine transformations 

1. We assume a system of affine coordinates has been introduced
in an affine space $\mathfrak{A}_n$ and we consider the
transformation, specified by the following formulas, of an arbitrary
point $M(x_1, \dots, x_n)$ into the point $M'(x'_1, \dots, x'_n)$:

$$
\left.
\begin{aligned}
x'_1 &= a_{11} x_1 + \dots + a_{1n} x_n + b_1 \\
&\ \ \vdots \\
x'_n &= a_{n1} x_1 + \dots + a_{nn} x_n + b_n
\end{aligned}
\right\} \tag{1}
$$

We assume that the $n \times n$ matrix $A = \| a_{ij} \|$ is
nonsingular, that is, that $\det A \ne 0$.

Such a transformation of $\mathfrak{A}_n$ is termed an affine
transformation. Since the matrix $A$ is nonsingular, the affine
transformation is one-to-one.

It is readily verified that if two formulas of type (1) differ even
by a single coefficient, then the affine transformations that they
specify in some system of affine coordinates are distinct (in the
sense of Subsection 1, Section 2, Chapter VI).

2. The definition of the class of affine transformations is
invariant with respect to a choice of affine coordinates.

True enough, for if we pass to other affine coordinates, the old
coordinates of a point $M$ and its image $M'$ will be expressed in
terms of the new coordinates by formulas of the first degree, and
all relations that will be involved are uniquely reversible.
Therefore when passing to other affine coordinates we will not go
outside the class of uniquely reversible formulas of the type (1).

3. Suppose we have, in affine space, a geometric figure
$\mathcal{A}$ specified by an equation of the type

$$ F(x) = 0 \tag{2} $$

where the symbol $x$ denotes the coordinates of the running point.
Let there be given an affine transformation which we denote
symbolically as

$$ x' = \varphi(x) \tag{3} $$

We seek the equation of the image $\mathcal{A}'$ of the figure
$\mathcal{A}$. From (3) we find that $x = \varphi^{-1}(x')$.
Substituting this expression into (2), we get the equation

$$ F(\varphi^{-1}(x')) = 0 \tag{4} $$

which is satisfied by all points of the figure $\mathcal{A}'$.
Because of the one-to-oneness of the transformation (3) there are no
extra points (points not belonging to $\mathcal{A}'$ do not satisfy
(4)).

We do not change the coordinate system in space and so for the
coordinates of the running point it is convenient to retain the
original symbol $x$ (and not $x'$, as in (4)). Finally, for
$\mathcal{A}'$ we get the equation

$$ F(\varphi^{-1}(x)) = 0 $$
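The rule $F(\varphi^{-1}(x)) = 0$ can be tried out on the unit circle
in the plane. In the sketch below the matrix, the shift and all helper
names are assumptions made for illustration; the code maps a few
points of the circle by $x' = Ax + b$ and checks that the images
satisfy the composed equation.

```python
import math

def apply(A, b, p):
    """Image of the point p under x' = A x + b (2x2 case)."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

def inverse(A, b):
    """Matrix and shift of the inverse map x = A^{-1} x' - A^{-1} b."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("the matrix must be nonsingular")
    Ai = [[A[1][1] / det, -A[0][1] / det],
          [-A[1][0] / det, A[0][0] / det]]
    bi = (-(Ai[0][0] * b[0] + Ai[0][1] * b[1]),
          -(Ai[1][0] * b[0] + Ai[1][1] * b[1]))
    return Ai, bi

def F(p):
    """Left member of the equation of the unit circle."""
    return p[0] ** 2 + p[1] ** 2 - 1

def image_equation(F, A, b):
    """Left member of the equation of the image figure: F(phi^{-1}(x))."""
    Ai, bi = inverse(A, b)
    return lambda p: F(apply(Ai, bi, p))

A = [[2.0, 1.0], [0.0, 3.0]]      # sample nonsingular matrix (assumed)
b = (1.0, -1.0)                   # sample shift (assumed)
G = image_equation(F, A, b)
points = [(math.cos(t), math.sin(t)) for t in (0.0, 0.5, 1.0, 2.0, 4.0)]
residuals = [abs(G(apply(A, b, p))) for p in points]
print(max(residuals))             # close to zero: images lie on G = 0
```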


Remark. Actually, the only thing used is the one-to-one nature
of the transformation (3). We will therefore make use of the
results of this subsection in Chapter XII when considering
transformations that are more general than affine transformations.
4. Affine transformations preserve the degree of the algebraic
equations; namely, if the coordinates of the point $M$ satisfy an
algebraic equation of degree $k$, then the coordinates of the point
$M'$ satisfy an equation of the same degree.

Proof. Since (1) are of the first degree, no term of the equation
can increase its degree as a result of the transformation.
Neither can there be a reduction in the degree because otherwise
we would have a rise in the degree during the reverse
transformation.

Corollary. Under an affine transformation, a hypersurface is
transformed into a hypersurface.

5. Theorem 1. Under an affine transformation, any plane of
dimension $k$ is carried into a plane of the same dimension.

Proof. Let a $k$-dimensional plane $P_k$ be specified by a linear
system of rank $n - k$ containing $n - k$ equations. Writing this
system in matrix form, we have

$$ Sx = s $$

where $S$ is an $(n-k) \times n$ matrix and $s$ denotes the column
matrix of constant terms. We also write down the affine
transformation (1) in matrix form:

$$ x' = Ax + b $$

Using Subsection 3, we find the matrix equation of the image $P'_k$
of the plane $P_k$:

$$ S A^{-1} x = \tilde{s} \tag{5} $$

where $\tilde{s} = s + S A^{-1} b$ is a new column of constant terms.
System (5) also contains $n - k$ equations in the coordinates of the
running point $x$. By Subsection 3, Section 4, Chapter II, and due to
the nonsingularity of $A$, we have

$$ \operatorname{rank} S A^{-1} = \operatorname{rank} S = n - k $$

Hence, system (5) is consistent and determines a plane of the
same dimension $k$, which was what we set out to prove.

6. It is clear that affine transformations preserve parallelism of
hyperplanes, for if prior to the transformation two hyperplanes did
not intersect, then neither will their images intersect; this is due
to the one-to-oneness of the transformation, whence follows the
parallelism of their images (see Section 6, Chapter III).

7. The following general assertion holds true.

An affine transformation preserves parallelism of planes of any
dimension. The proof is left to the reader.
8. Theorem 2. An affine transformation in $n$-dimensional affine
space is uniquely defined if for the inverse images there is
specified an arbitrary ordered system of $n + 1$ points
$M_0, M_1, \dots, M_n$ in the general position, and if for their
images we have an arbitrary analogous system $N_0, N_1, \dots, N_n$.

Proof. Introduce in $\mathfrak{A}_n$ an affine coordinate system
with $M_0$ as origin and with the vectors
$\overrightarrow{M_0M_1}, \dots, \overrightarrow{M_0M_n}$ as basis.
(These vectors are independent since the points
$M_0, M_1, \dots, M_n$ are in the general position; see Subsection 5,
Section 1, Chapter III.) In these coordinates, the desired affine
transformation is given by formulas of type (1), in which the column
of constant terms consists of the coordinates of the point $N_0$ and
the column of coefficients of $x_i$ consists of the coordinates of
the vector $\overrightarrow{N_0N_i}$. The condition $\det A \ne 0$ is
fulfilled because the points $N_i$ are in the general position. Thus,
the desired transformation exists and is unique.
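Theorem 2 can be illustrated in the plane, where three points in the
general position (not on one line) determine the transformation. The
sketch below is an assumption-laden illustration rather than the
construction of the proof: it works in a fixed coordinate system
instead of the adapted one, uses the hypothetical helper name
`affine_from_points`, and obtains the matrix as $A = VU^{-1}$, where
the columns of $U$ are the vectors $M_0M_1$, $M_0M_2$ and the columns
of $V$ are $N_0N_1$, $N_0N_2$.

```python
from fractions import Fraction

def affine_from_points(M, N):
    """Return (A, b) with A*M[i] + b == N[i] for three points of the plane
    in general position (2x2 case, exact rational arithmetic)."""
    M = [tuple(Fraction(v) for v in p) for p in M]
    N = [tuple(Fraction(v) for v in p) for p in N]
    # columns of U and V are the edge vectors issuing from M0 and N0
    U = [[M[1][i] - M[0][i], M[2][i] - M[0][i]] for i in range(2)]
    V = [[N[1][i] - N[0][i], N[2][i] - N[0][i]] for i in range(2)]
    det_u = U[0][0] * U[1][1] - U[0][1] * U[1][0]
    det_v = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    if det_u == 0 or det_v == 0:
        raise ValueError("points are not in general position")
    Ui = [[U[1][1] / det_u, -U[0][1] / det_u],
          [-U[1][0] / det_u, U[0][0] / det_u]]
    A = [[sum(V[i][k] * Ui[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    b = tuple(N[0][i] - sum(A[i][k] * M[0][k] for k in range(2))
              for i in range(2))
    return A, b

A, b = affine_from_points([(0, 0), (1, 0), (0, 1)], [(1, 1), (3, 1), (1, 4)])
print(A, b)
```

For the sample points the result is a stretch by factors 2 and 3 along
the axes followed by a shift by the vector $\{1, 1\}$.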


9. Consider two transformations:

(1) The linear transformation

$$
\left.
\begin{aligned}
x'_1 &= a_{11} x_1 + \dots + a_{1n} x_n \\
&\ \ \vdots \\
x'_n &= a_{n1} x_1 + \dots + a_{nn} x_n
\end{aligned}
\right\} \tag{6}
$$

(2) The parallel transfer

$$
\left.
\begin{aligned}
x'_1 &= x_1 + b_1 \\
&\ \ \vdots \\
x'_n &= x_n + b_n
\end{aligned}
\right\} \tag{7}
$$

Under parallel transfer, all points are displaced simultaneously
by one and the same vector $\{b_1, \dots, b_n\}$.

Every transformation of type (1) is a composition of the
transformations (6) and (7), and conversely.

For this reason, every affine transformation is a composition of
the linear transformation (6), provided that it is nonsingular, and
the parallel transfer (7).


10. Now consider Euclidean space. By the preceding subsection
and Section 11 of Chapter IX, every affine transformation can be
represented as a composition of a self-adjoint transformation, an
isometric transformation, and a parallel transfer.

We can speak of an isometric transformation (or, briefly,
isometry) in a somewhat broader sense than in Chapter IX, namely,
we can consider it not as a linear transformation, but as a
transformation of type (1) that preserves distance between points.
Then both linear isometric transformations and parallel transfers
are isometries, and their composition is an isometry. For this
reason, an affine transformation is a product of an isometry and of
contractions along $n$ orthogonal directions.

11. It is important to point out that all affine transformations
of the space constitute a group, which is called the affine group
of the space $\mathfrak{A}_n$.

To prove this, it is sufficient, by Section 2 of Chapter VI, to
verify that:

(1) an affine transformation is invertible and the inverse is also
an affine transformation;

(2) the product of two affine transformations is an affine
transformation.

Both properties obviously follow from Subsection 1.
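Both group properties can be written out explicitly for maps of the
form $x' = Ax + b$. The sketch below treats the plane only; the pair
representation `(A, b)` and the function names are assumptions made
for illustration. The composite of two maps has matrix $A_1 A_2$ and
shift $A_1 b_2 + b_1$, and the inverse has matrix $A^{-1}$ and shift
$-A^{-1} b$.

```python
from fractions import Fraction

def compose(f, g):
    """Affine map x -> f(g(x)); each map is a pair (A, b) for x' = A x + b."""
    (A, a), (B, b) = f, g
    C = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    c = [sum(A[i][k] * b[k] for k in range(2)) + a[i] for i in range(2)]
    return C, c

def invert(f):
    """Inverse of the affine map (A, b): matrix A^{-1}, shift -A^{-1} b."""
    A, a = f
    det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    if det == 0:
        raise ValueError("the matrix must be nonsingular")
    Ai = [[A[1][1] / det, -A[0][1] / det],
          [-A[1][0] / det, A[0][0] / det]]
    ai = [-(Ai[i][0] * a[0] + Ai[i][1] * a[1]) for i in range(2)]
    return Ai, ai

f = ([[2, 1], [0, 3]], [1, -1])   # sample maps (assumed)
g = ([[1, 2], [3, 4]], [0, 5])
print(compose(f, g))              # again of the form (matrix, shift)
print(compose(f, invert(f)))      # identity matrix and zero shift
```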

12. Definition. Two figures $\mathcal{A}$ and $\mathcal{A}'$ in
affine space are said to be affinely equivalent if one of them is an
image of the other under some affine transformation.

Since affine transformations constitute a group, we have the
following properties of the affine equivalence of figures:

(1) if $\mathcal{A}$ is equivalent to $\mathcal{A}'$, then
$\mathcal{A}'$ is equivalent to $\mathcal{A}$;

(2) if $\mathcal{A}$ is equivalent to $\mathcal{A}'$, and
$\mathcal{A}'$ is equivalent to $\mathcal{A}''$, then $\mathcal{A}$
is equivalent to $\mathcal{A}''$;

(3) every figure is equivalent to itself.

Example. Every ellipse on a two-dimensional Euclidean plane is
affinely equivalent to the unit circle.

It is left to the reader to prove that in the space
$\mathfrak{A}_n$ any two planes of the same dimension $k$
($1 \le k \le n - 1$) are affinely equivalent.

§ 8. Affine classification of quadric hypersurfaces 

1. We have established that in the affine space $\mathfrak{A}_n$
all quadric hypersurfaces are distributed into a finite number of
classes so that within one class all surfaces are affinely
equivalent to one another. This distribution into classes is called
the affine classification of quadric hypersurfaces. (One also speaks
of a classification relative to the affine group.)

2. Take an arbitrary quadric hypersurface and reduce its equation
to canonical form. Algebraically, this means that we transform the
left member of the equation via certain formulas of type (1),
Section 7. If we regard these formulas not as formulas for the
transformation of coordinates, but as an affine transformation,
then the resulting canonical equation yields a hypersurface that
is affinely equivalent to the original one. If one more affine
transformation is carried out, namely a contraction along the
directions of the coordinate axes, then all nonzero coefficients can
be reduced to $+1$ or $-1$ in any canonical equation.
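A brief worked example of such a contraction, with a sample hyperbola
chosen for illustration (the symbols $y_1$, $y_2$ for the contracted
coordinates are an assumption):

```latex
% Canonical equation with \lambda_1 = 4, \lambda_2 = -9, H = 36
4x_1^2 - 9x_2^2 = 36
% Contraction along the axes: x_i = \sqrt{|H / \lambda_i|}\; y_i
x_1 = 3y_1, \qquad x_2 = 2y_2
% Substituting and dividing by 36
y_1^2 - y_2^2 = 1
```

Only the signs of the coefficients survive, which is why the affine
classes are labelled by sign patterns.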

For this reason, in cases (I) and (I'), Section 6, we obtain, for
$H \ne 0$,

$$ \pm x_1^2 \pm \dots \pm x_r^2 = 1 \qquad (1 \le r \le n) \tag{1} $$

in cases (I) and (I'), Section 6, we obtain, for $H = 0$,

$$ \pm x_1^2 \pm \dots \pm x_r^2 = 0 \qquad (1 \le r \le n) \tag{2} $$

and in cases (II) and (II') we get

$$ \pm x_1^2 \pm \dots \pm x_r^2 - 2x_n = 0 \qquad (1 \le r \le n - 1) \tag{3} $$

Surfaces specified by distinct equations of type (1), (2) and (3)
cannot be carried into one another via an affine transformation
because of the law of inertia of quadratic forms. Distinct equations
here are those which cannot be carried one into another by
multiplication by $(-1)$ and renumbering the coordinates.

We have thus obtained the desired classes of affinely equivalent
hypersurfaces, each of which has its representative among the
equations (1), (2), (3).

§ 9. The intersection of a straight line with a quadric
hypersurface. Asymptotic directions

1. Given a hypersurface

$$ 2F = \sum a_{ik} x_i x_k + 2 \sum b_k x_k + c = 0 \tag{1} $$

and an arbitrary point $M_0$ with coordinates
$(x_1^0, \dots, x_n^0)$. Draw a straight line through $M_0$ in the
direction of a vector $l = \{l_1, \dots, l_n\}$.

We now seek the points of intersection of the straight line and
the hypersurface (1).

The coordinates of the running point $M$ on the indicated line
are given by the equations

$$ x_i = x_i^0 + l_i t, \qquad -\infty < t < \infty \tag{2} $$

To find the points of intersection it is necessary to solve
simultaneously the equations (1) and (2). Putting (2) in (1), we get

$$ t^2 \sum a_{ik} l_i l_k + 2t \left( \sum a_{ik} x_i^0 l_k + \sum b_k l_k \right) + 2F(x_1^0, \dots, x_n^0) = 0 \tag{3} $$

It is necessary to investigate equation (3).

2. If $\sum a_{ik} l_i l_k \neq 0$, then (3) is a quadratic equation. In this
case there are two points of intersection; these points may be two
distinct real points, two distinct complex conjugate points, or,
finally, coincident points. In the last case we say that the
straight line has a multiple point of intersection with the surface.

Example. On the Euclidean plane, the circle $x^2 + y^2 = 1$ and
the straight line $x = -2$ do not have real points of intersection
(Fig. 85). A simple computation yields the coordinates of the
complex conjugate points of their intersection: $(-2, \pm i\sqrt{3})$.

In order to see these points, consider a circle $x^2 + y^2 = 1$ on
a two-dimensional complex plane. For a model of a two-dimensional
complex plane we take four-dimensional real space (see Section 11,
Chapter I). Set

$$x = u + i\xi, \quad y = v + i\eta \tag{4}$$

Putting these expressions into the equation of the circle and
separating the real and imaginary parts, we get

$$u^2 + v^2 - \xi^2 - \eta^2 = 1, \quad u\xi + v\eta = 0 \tag{5}$$

System (5) shows that the circle $x^2 + y^2 = 1$, considered on the
two-dimensional complex plane (4), is depicted in the four-dimensional
space of the variables $(u, v, \xi, \eta)$ as the intersection of a
hyperboloid and a cone.

The straight line $x = -2$ is depicted in the space of variables
$(u, v, \xi, \eta)$ as a two-dimensional plane:

$$u = -2, \quad \xi = 0 \tag{6}$$

Consider in four-dimensional space the three-dimensional
subspace $\xi = 0$. The plane (6) is completely contained in it, while the
circle (5) intersects this subspace in the figure

$$u^2 + v^2 - \eta^2 = 1, \quad v\eta = 0 \tag{7}$$

consisting of the ordinary circle

$$u^2 + v^2 = 1, \quad \eta = 0$$

(it is precisely this circle that we see on the real Euclidean plane)
and of the hyperbola

$$u^2 - \eta^2 = 1, \quad v = 0$$

(Fig. 86). The plane (6) and the figure (7) intersect in the same
points $u = -2$, $v = 0$, $\eta = \pm\sqrt{3}$ that have just been under
discussion.
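
The substitution of (2) into (1) can be checked numerically. The following is a minimal sketch in Python, assuming the hypersurface is given by a symmetric coefficient matrix `a`, a vector `b` and a constant `c` as in (1). The function names are illustrative and do not come from the text. It computes the coefficients of equation (3) and, for a direction that makes (3) quadratic, the two (possibly complex) parameter values.

```python
import cmath

def line_quadric_coefficients(a, b, c, x0, l):
    """Coefficients (A, B, C) of equation (3) written as A*t**2 + 2*B*t + C = 0."""
    n = len(x0)
    A = sum(a[i][k] * l[i] * l[k] for i in range(n) for k in range(n))
    B = (sum(a[i][k] * x0[i] * l[k] for i in range(n) for k in range(n))
         + sum(b[k] * l[k] for k in range(n)))
    # C is the value of 2F at the point M0
    C = (sum(a[i][k] * x0[i] * x0[k] for i in range(n) for k in range(n))
         + 2 * sum(b[k] * x0[k] for k in range(n)) + c)
    return A, B, C

def intersection_parameters(a, b, c, x0, l):
    """Both roots of (3); complex roots correspond to complex conjugate points."""
    A, B, C = line_quadric_coefficients(a, b, c, x0, l)
    if A == 0:
        raise ValueError("equation (3) is not quadratic for this direction")
    root = cmath.sqrt(B * B - A * C)
    return (-B + root) / A, (-B - root) / A

# Circle x^2 + y^2 = 1 and the line x = -2, taken through M0 = (-2, 0)
# in the direction l = (0, 1), so that y = t along the line.
circle_a = [[1, 0], [0, 1]]
t1, t2 = intersection_parameters(circle_a, [0, 0], -1, [-2, 0], [0, 1])
```

For the circle and the line of the example, the two roots are purely imaginary and opposite, which agrees with the points $(-2, \pm i\sqrt{3})$ found above.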

3. If

$$\sum a_{ik} l_i l_k = 0 \tag{8}$$

then for (3) we will have either a first-degree equation, or a
contradictory equation, or an identity. In the first of these three cases,
we say that the straight line intersects the hypersurface once in
a finite point and the next time at infinity. In the second case, we
say that the line has a double intersection at infinity with the
hypersurface. In the third case, the line lies entirely on the
hypersurface. In all three cases, we say that the line has an asymptotic
direction relative to the given hypersurface. The asymptotic
direction is given by the vector $l = \{l_1, \dots, l_n\}$, provided we have (8).
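
Condition (8) is easy to test for a concrete direction. The sketch below is only an illustration, with a hypothetical helper name; it assumes exact (integer or rational) coefficients so that comparison with zero is meaningful.

```python
def is_asymptotic(a, l):
    """True if the direction l satisfies condition (8) for the quadratic part a."""
    n = len(l)
    return sum(a[i][k] * l[i] * l[k] for i in range(n) for k in range(n)) == 0

# Quadratic part of the hyperbola x^2 - y^2 = 1 and of the circle x^2 + y^2 = 1.
hyperbola_a = [[1, 0], [0, -1]]
circle_a = [[1, 0], [0, 1]]
```

For the hyperbola the directions (1, 1) and (1, −1) satisfy (8), in agreement with its two asymptotes, while the circle has no real asymptotic direction.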

All straight lines having asymptotic directions and passing
through a single point form a cone (Fig. 87). From (2) and (8) we
obtain the equation of the cone of asymptotic directions, the vertex
of which is located at the point $M_0$:

$$\sum a_{ik} (x_i - x_i^0)(x_k - x_k^0) = 0$$

From the equations of the centre it follows that if $M_0$ lies at the
centre of a hypersurface and the vector $l$ has an asymptotic
direction, then equation (3) assumes the form

$$0 \cdot \tau^2 + 0 \cdot \tau + 2F(x_1^0, \dots, x_n^0) = 0$$

Then if $F(x_1^0, \dots, x_n^0) \neq 0$, it follows that the straight lines
forming this cone do not meet the hypersurface at a single finite
point. These straight lines may be called asymptotic lines, and the
cone an asymptotic cone.

Fig. 87





Instances are the asymptotic cones of hyperboloids in
three-dimensional space (Fig. 88) and the asymptotes of a hyperbola for
n = 2.

§ 10. Conjugate directions 

1. Suppose a vector $l$ has a nonasymptotic direction. Then any
straight line passing in the direction of $l$ intersects the
hypersurface in two points $M_1$ and $M_2$. Let $M_0$ be the midpoint of the chord
$M_1M_2$. For the definition of the midpoint of a line segment in the
case of a real affine space see Subsection 3, Section 8, Chapter III.
If $M_1$, $M_2$ are complex conjugate points, then the midpoint $M_0$ of
the chord $M_1M_2$ is to be understood as the point whose coordinates
are the arithmetic means of the coordinates of the endpoints. $M_0$ is
a real point in this case as well.

Consider all straight lines parallel to the vector $l$ and on each
of them find the midpoint of the chord $M_1M_2$. It turns out that the
locus of these midpoints is a hyperplane.

We now prove this. We have

$$\overrightarrow{M_0M_1} = \tau_1 l, \quad \overrightarrow{M_0M_2} = \tau_2 l \tag{1}$$

where $\tau_2 = -\tau_1$. If $M_1$ and $M_2$ are real points, then the equality
$\tau_2 = -\tau_1$ is evident geometrically. If $M_1$, $M_2$ are complex
conjugate points, then in place of each of the vector equations (1) we
can write $n$ coordinate equations. From them too it will follow
immediately that $\tau_2 = -\tau_1$.

Thus

$$\tau_1 + \tau_2 = 0 \tag{2}$$

Let us go back to equation (3) of Section 9. Since the straight
line has a nonasymptotic direction, the coefficient of the square of
the unknown is nonzero. By Vieta's theorem (the sum and
product formulas of Vieta) and because of (2), the coefficient of the
first power of the unknown in equation (3), Section 9, must vanish.
Therefore

$$\sum a_{ik} x_i^0 l_k + \sum b_k l_k = 0 \tag{3}$$

To obtain an equation for all midpoints, we have to assume
that $M_0$ is any midpoint and then consider its coordinates as the
running coordinates $(x_1, \dots, x_n)$. Then from (3) we have

$$\sum a_{ik} l_k x_i + \sum b_k l_k = 0 \tag{4}$$

Put

$$N_i = \sum_k a_{ik} l_k, \quad D = \sum b_k l_k$$

Then (4) becomes

$$N_1 x_1 + \dots + N_n x_n + D = 0 \tag{5}$$


It is easy to prove that there are nonzero numbers among the $N_i$.
Indeed, suppose

$$N_i = \sum_k a_{ik} l_k = 0 \tag{6}$$

for all $i = 1, \dots, n$. Multiplying equations (6) by $l_i$ and adding,
we obtain

$$\sum a_{ik} l_i l_k = 0$$

contrary to the hypothesis made at the beginning of the section.

Since there are nonzero numbers among the numbers $N_1, \dots, N_n$,
(5) is the equation of a hyperplane.

2. Hyperplane (5) is called the diametral hyperplane conjugate
to the direction $l$ with respect to the given hypersurface.

It bisects every chord parallel to $l$.
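
The bisecting property can be verified on a small example. The sketch below assumes the notation of (1), Section 9, for the coefficients `a` and `b`; the function names are hypothetical. It builds the coefficients $N_i$ and $D$ of (5) and checks that the midpoint of one chord parallel to $l$ satisfies (5). Exact rational arithmetic is used so that the check is not affected by rounding.

```python
from fractions import Fraction as Fr

def diametral_hyperplane(a, b, l):
    """Return (N, D) with N_i = sum_k a[i][k]*l[k] and D = sum_k b[k]*l[k]."""
    n = len(l)
    N = [sum(a[i][k] * l[k] for k in range(n)) for i in range(n)]
    D = sum(b[k] * l[k] for k in range(n))
    return N, D

def chord_midpoint(a, b, x0, l):
    """Midpoint of the chord cut out on the line x0 + t*l (l nonasymptotic)."""
    n = len(l)
    A = sum(a[i][k] * l[i] * l[k] for i in range(n) for k in range(n))
    B = (sum(a[i][k] * x0[i] * l[k] for i in range(n) for k in range(n))
         + sum(b[k] * l[k] for k in range(n)))
    t_mid = -B / A  # half the sum of the roots of A*t^2 + 2*B*t + C = 0
    return [x0[i] + t_mid * l[i] for i in range(n)]

# Ellipse x^2/4 + y^2 = 1 and chords parallel to l = (1, 1).
a = [[Fr(1, 4), Fr(0)], [Fr(0), Fr(1)]]
b = [Fr(0), Fr(0)]
l = [Fr(1), Fr(1)]
N, D = diametral_hyperplane(a, b, l)
mid = chord_midpoint(a, b, [Fr(0), Fr(1, 2)], l)
residual = sum(N[i] * mid[i] for i in range(2)) + D  # left member of (5) at the midpoint
```

Here the diametral hyperplane is the line $x/4 + y = 0$, and the midpoint of the chosen chord lies on it.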

3. The numbers $N_1, \dots, N_n$ form the coordinates of a vector

$$N = \{N_1, \dots, N_n\}$$

We assume the coordinates in space to be orthonormal. Then
vector $N$ is orthogonal to the diametral hyperplane (5), that is to
say, it is its normal vector.

The relations

$$N_i = \sum_k a_{ik} l_k \quad (i = 1, \dots, n)$$

may be regarded as the linear transformation

$$N = Al$$

which carries vector $l$ into vector $N$ (Fig. 89).

This is precisely the self-adjoint linear transformation that 
engaged us when we investigated the general equation of a quadric 
hypersurface. 

4. We know that in the process of reducing the equation of a
hypersurface to canonical form it is necessary to direct the
coordinate axes along the eigenvectors of the transformation $A$.
It is now easy to see why, for geometric reasons, it is precisely
these directions that are advantageous for simplifying the
equations of a hypersurface.

Suppose for the sake of simplicity we are considering a proper
direction that is not asymptotic. Then the conjugate diametral
hyperplane exists and is perpendicular to this direction. It is
therefore the plane of symmetry of the given hypersurface.

From this, at least in the case of a nondegenerate central
hypersurface, it is clear that by reducing it to canonical form we take
for the coordinate planes the orthogonal system of its planes of
symmetry.

§ 1. Homogeneous coordinates in affine space. Points at infinity


1. We consider the $n$-dimensional affine space $\mathfrak{A}_n$. In it is given
a system of affine coordinates with origin $O$ and basis $e_1, \dots, e_n$.
Let $x_1, \dots, x_n$ be the coordinates of an arbitrary point $M$.

We now introduce into $\mathfrak{A}_n$ the so-called homogeneous
coordinates.

We say that $\xi_1, \dots, \xi_{n+1}$ $(\xi_{n+1} \neq 0)$ are the homogeneous
coordinates of a point $M(x_1, \dots, x_n)$ if

$$x_1 = \frac{\xi_1}{\xi_{n+1}}, \quad \dots, \quad x_n = \frac{\xi_n}{\xi_{n+1}} \tag{1}$$

It is clear that $M$ is defined by the numbers $(\xi_1, \dots, \xi_{n+1})$.
We will write $M(\xi_1, \dots, \xi_{n+1})$. It is also clear, in turn, that $M$
does not fully determine its homogeneous coordinates. Indeed, if
we multiply all the homogeneous coordinates by one and the same
nonzero number, the point will not change. In other words, the
set of numbers $(\xi_1, \dots, \xi_{n+1})$ and the set of numbers
$(\lambda\xi_1, \dots, \lambda\xi_{n+1})$, for $\lambda \neq 0$, determine one and the same point.
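
Formula (1) and the remark on proportional sets can be illustrated by a short sketch. The helper names and the `scale` parameter are assumptions made for the illustration; rational numbers keep the arithmetic exact.

```python
from fractions import Fraction as Fr

def to_homogeneous(x, scale=1):
    """One of the many sets of homogeneous coordinates of the affine point x."""
    return [scale * xi for xi in x] + [scale]

def from_homogeneous(xi):
    """Affine coordinates by formula (1); requires a nonzero last coordinate."""
    if xi[-1] == 0:
        raise ValueError("last coordinate is zero: formula (1) does not apply")
    return [c / xi[-1] for c in xi[:-1]]

M = [Fr(3), Fr(-1, 2)]
coords_a = to_homogeneous(M)           # scale 1
coords_b = to_homogeneous(M, Fr(7))    # the same point, all coordinates times 7
```

Both `coords_a` and `coords_b` are carried back to the same affine point by `from_homogeneous`.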


2. By what has been said, the homogeneous coordinates of an
arbitrarily taken point depend on the choice of the affine system of
coordinates, that is, on the choice of the origin $O$ and the basis
$e_1, \dots, e_n$. Let us change to a new origin and a new basis. Then
the affine (nonhomogeneous) coordinates will change via formulas
of the form

$$\begin{aligned} x'_1 &= Q_{11} x_1 + \dots + Q_{1n} x_n + Q_{1,n+1} \\ &\;\;\vdots \\ x'_n &= Q_{n1} x_1 + \dots + Q_{nn} x_n + Q_{n,n+1} \end{aligned} \tag{2}$$

where the constant terms $Q_{1,n+1}, \dots, Q_{n,n+1}$ are the new
coordinates of the old origin $O$; the matrix $\|Q_{ij}\|$ $(i, j = 1, \dots, n)$ is
determined in familiar fashion from the old and new bases (see
Section 5, Chapter II). As we see, formulas (2) are generally not
homogeneous.
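From (1) and (2) we get appropriate formulas for the
transformation of homogeneous coordinates:

$$\begin{aligned} \xi'_1 &= Q_{11}\xi_1 + \dots + Q_{1n}\xi_n + Q_{1,n+1}\xi_{n+1} \\ &\;\;\vdots \\ \xi'_n &= Q_{n1}\xi_1 + \dots + Q_{nn}\xi_n + Q_{n,n+1}\xi_{n+1} \\ \xi'_{n+1} &= \xi_{n+1} \end{aligned} \tag{3}$$

where the last equation $\xi'_{n+1} = \xi_{n+1}$ is taken at pleasure. We could
have written

$$\begin{aligned} \lambda\xi'_1 &= Q_{11}\xi_1 + \dots + Q_{1,n+1}\xi_{n+1} \\ &\;\;\vdots \\ \lambda\xi'_n &= Q_{n1}\xi_1 + \dots + Q_{n,n+1}\xi_{n+1} \\ \lambda\xi'_{n+1} &= \xi_{n+1} \end{aligned} \tag{4}$$

where $\lambda$ is any nonzero scalar.

We see that the formulas for transformation of homogeneous
coordinates are themselves homogeneous, that is to say, they do
not have constant terms.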
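
The passage from the nonhomogeneous formulas (2) to the homogeneous formulas (3) can be sketched as follows. The coefficient table `Q` (with $n$ rows and $n + 1$ columns, the last column holding the constant terms) and the function names are assumptions of this illustration.

```python
from fractions import Fraction as Fr

def affine_change(Q, x):
    """Formulas (2): linear part plus constant terms from the last column of Q."""
    n = len(x)
    return [sum(Q[i][k] * x[k] for k in range(n)) + Q[i][n] for i in range(n)]

def homogeneous_change(Q, xi):
    """Formulas (3): the same coefficients, no constant terms, last coordinate kept."""
    n = len(xi) - 1
    new = [sum(Q[i][k] * xi[k] for k in range(n + 1)) for i in range(n)]
    return new + [xi[n]]

Q = [[Fr(2), Fr(1), Fr(5)],
     [Fr(0), Fr(3), Fr(-1)]]
x = [Fr(1), Fr(4)]
xi = [Fr(3), Fr(12), Fr(3)]            # homogeneous coordinates of the same point
xi_new = homogeneous_change(Q, xi)
x_from_xi = [c / xi_new[-1] for c in xi_new[:-1]]
```

Dividing the transformed homogeneous coordinates by the last one gives the same result as applying (2) directly.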


3. Let us consider an arbitrary hyperplane in $\mathfrak{A}_n$:

$$A_1 x_1 + \dots + A_n x_n + A_{n+1} = 0$$

Passing to homogeneous coordinates, we get the equation

$$A_1\xi_1 + \dots + A_n\xi_n + A_{n+1}\xi_{n+1} = 0 \tag{5}$$

Thus, in homogeneous coordinates a hyperplane is determined
by a first-degree homogeneous equation. Accordingly, a plane of
any dimension $k$ is determined by a system of homogeneous linear
equations of rank $r = n - k$.


4. We consider in $\mathfrak{A}_n$ an arbitrary quadric hypersurface:

$$\sum_{i,j=1}^{n} a_{ij} x_i x_j + 2\sum_{i=1}^{n} a_{i,n+1} x_i + a_{n+1,n+1} = 0$$

Passing to homogeneous coordinates, we get

$$\sum_{i,j=1}^{n} a_{ij} \xi_i \xi_j + 2\sum_{i=1}^{n} a_{i,n+1} \xi_i \xi_{n+1} + a_{n+1,n+1} \xi_{n+1}^2 = 0$$

or

$$\sum_{\alpha,\beta=1}^{n+1} a_{\alpha\beta} \xi_\alpha \xi_\beta = 0 \tag{6}$$

We again have a homogeneous equation. We can already see the
convenience of using homogeneous coordinates since the equation
of the quadric hypersurface is written in more compact form
because its left member is a quadratic form. Actually, we have
already employed homogeneous coordinates in Section 2 of
Chapter XI and found them to be useful.

5. To the affine space $\mathfrak{A}_n$ let us adjoin new elements which we
will call points at infinity (ideal points). We do not offer any
pictorial descriptions and will merely regard a point at infinity as
an entity determined by homogeneous coordinates $(\xi_1, \dots, \xi_{n+1})$,
provided that

$$\xi_{n+1} = 0 \tag{7}$$

and that among the numbers $(\xi_1, \dots, \xi_n)$ at least one is different
from zero. Also, we assume that the sets $(\lambda\xi_1, \dots, \lambda\xi_n, 0)$ for all
$\lambda \neq 0$ specify one and the same ideal point, while nonproportional
sets define distinct points.

We will say that a certain ideal point belongs to a given
hyperplane or a given quadric hypersurface, and so on, if its coordinates
satisfy the equation of the hyperplane or the equation of the
hypersurface, etc. Under a change of the original affine coordinate
system, we assume that the coordinates of an arbitrarily chosen
ideal point change via the formulas (3) or (4), where the last line
is the identity 0 = 0.

6. We will assume that every homogeneous linear equation in
the coordinates $\xi_1, \dots, \xi_{n+1}$, that is, every equation of type (5),
determines some hyperplane. Equation (7) is also such an equation.
Accordingly, all ideal points are viewed as constituting a
hyperplane, which is called the ideal hyperplane.

7. In $\mathfrak{A}_n$ let us take two parallel hyperplanes

$$A_1\xi_1 + \dots + A_n\xi_n + A'_{n+1}\xi_{n+1} = 0$$

and

$$A_1\xi_1 + \dots + A_n\xi_n + A''_{n+1}\xi_{n+1} = 0$$

The intersection of each of them with the ideal hyperplane is
given by one and the same system of equations

$$A_1\xi_1 + \dots + A_n\xi_n = 0, \quad \xi_{n+1} = 0 \tag{8}$$

whence it follows that parallel hyperplanes have common ideal
points. Since the system (8) has rank $r = 2$, the set of all ideal
points common to both parallel hyperplanes is to be considered an
(ideal) plane of dimension $n - 2$. (See Fig. 90; here and henceforth
we will depict elements at infinity (ideal elements) and the
intersections of geometric figures at ideal points in a conventional
fashion.)

8. Assuming that every homogeneous system of linear equations
of rank $r = n - k$ specifies a plane of dimension $k$ in homogeneous
coordinates, we can establish, as was done in the preceding
subsection, that two parallel planes of the same dimension $k$ intersect
at infinity in an ideal plane of dimension $k - 1$. In particular, any
two parallel straight lines intersect in a single ideal point
(Fig. 91).

Fig. 90    Fig. 91
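
For $n = 2$ the statement about parallel lines can be tried out directly. The sketch below uses the cross product of the two coefficient triples to find a common solution of two homogeneous linear equations; this device is a standard one and is not taken from the text, and the function names are hypothetical.

```python
def meet(A, B):
    """Homogeneous coordinates of the common point of two lines in the plane.

    A and B are the coefficient triples of A1*xi1 + A2*xi2 + A3*xi3 = 0.
    The cross product is orthogonal to both triples, so it satisfies both equations.
    """
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

def is_ideal(point):
    """True for a point at infinity: last coordinate zero, not all others zero."""
    return point[-1] == 0 and any(c != 0 for c in point[:-1])

p_parallel = meet((1, 1, -1), (1, 1, -3))   # lines x + y = 1 and x + y = 3
p_crossing = meet((1, 1, -1), (1, -1, 0))   # lines x + y = 1 and x = y
```

The two parallel lines meet in an ideal point whose first two coordinates are proportional to their common direction, while the intersecting lines meet in an ordinary point.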

9. The affine space $\mathfrak{A}_n$ thus augmented by ideal elements is
termed an $n$-dimensional projective space. However, it would be
more exact to say that this is one of a number of concrete models
of $n$-dimensional projective space, a general description of which
is given in the next section.

§ 2. The concept of a projective space

1. Let us consider a set of objects whose nature and exterior
aspect are immaterial. All we assume is that each of these entities
is uniquely specified by an ordered set of numbers $(\xi_1, \dots, \xi_{n+1})$.
These entities will be called points and each will be denoted in
the usual manner, for instance, $M(\xi_1, \dots, \xi_{n+1})$.

This set will be called an $n$-dimensional projective space,
denoted $P_n$, if the two following conditions hold.

(A) Any ordered set of numbers $(\xi_1, \dots, \xi_{n+1})$ determines a
point $M(\xi_1, \dots, \xi_{n+1})$ if at least one of the numbers $\xi_1, \dots, \xi_{n+1}$
is nonzero. An $(n+1)$-tuple consisting solely of zeros does not
determine any point.

(B) If $\lambda$ is a scalar not equal to zero ($\lambda \neq 0$), then two
proportional sets $(\xi_1, \dots, \xi_{n+1})$ and $(\lambda\xi_1, \dots, \lambda\xi_{n+1})$ determine one and
the same point in $P_n$. Nonproportional sets determine distinct
points in $P_n$.

The numbers $\xi_1, \dots, \xi_{n+1}$ are called the homogeneous coordinates
of the point $M$ in $P_n$.

Important remark. The foregoing is viewed as a definition of an
$n$-dimensional projective space. However, ordinarily the definition
of a projective space includes the description of certain special
subsets, called $k$-dimensional planes ($k = 0, 1, \dots, n$), the system
of these subsets having definite properties. These properties are
expressed by the axioms of the projective space. In one and the
same set of points, subsets of this kind (planes) may be designated
in different ways or, as it is common to say, it is possible to
specify distinct projective structures on one and the same set. We
allow ourselves to restrict the definition of a projective space to
only two axioms, (A) and (B), because later we will introduce
systems of subsets called planes in accordance with a very definite
standard (via systems of linear equations), and we will always
adhere strictly to this standard.

Remark. We do not give the axioms of a projective space, for
the axiomatic definition of a projective space is more involved
than the familiar axiomatic definition of a linear space that was
given in Section 1, Chapter I.

2. We stress from the very start that a projective space is not
a vector space for the reason that no linear operations will be
defined in it.

3. A projective space is said to be real if for its points
$M(\xi_1, \dots, \xi_{n+1})$ only real values of the coordinates $(\xi_1, \dots, \xi_{n+1})$
are admitted. If complex numbers are also admitted for the $\xi_k$, then
the projective space is said to be complex.

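4. Consider the relations

$$\begin{aligned} \lambda\xi'_1 &= Q_{11}\xi_1 + \dots + Q_{1,n+1}\xi_{n+1} \\ &\;\;\vdots \\ \lambda\xi'_{n+1} &= Q_{n+1,1}\xi_1 + \dots + Q_{n+1,n+1}\xi_{n+1} \end{aligned} \tag{1}$$

provided that the $(n+1) \times (n+1)$ matrix $Q = \|Q_{ij}\|$ is
nonsingular and $\lambda \neq 0$.

Then, starting with an arbitrarily specified set of numbers
$(\xi_1, \dots, \xi_{n+1})$, we can determine a new set of numbers
$(\xi'_1, \dots, \xi'_{n+1})$. And if among the numbers $\xi_i$ there is at least one
nonzero number, then there will also be a nonzero number among the
$\xi'_i$. This is due to the nonsingularity of the matrix $Q$.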

We will assume that the numbers $(\xi'_1, \dots, \xi'_{n+1})$ are the new
coordinates of a point which had earlier been defined by the
coordinates $(\xi_1, \dots, \xi_{n+1})$. Also, as is readily seen, requirements (A)
and (B) hold true for the new sets of numbers $\xi'_i$. For any
specification of the matrix $Q$ we will regard formulas (1) as formulas
for the transformation of the coordinates. For the present we will
not look into the geometric meaning of these transformations (it
will be examined in the next section).

Thus, alongside the original coordinates $\xi_1, \dots, \xi_{n+1}$ we
introduce many other coordinate systems, to which we give the name
projective systems of coordinates in $P_n$. Each new system is defined
by a specification of the matrix $Q$.

The original coordinates $\xi_1, \dots, \xi_{n+1}$ do not have any
preferential advantage over the other coordinate systems introduced by the
formulas (1). Indeed, (1) may be inverted and then the old
coordinates will be expressed in terms of the new ones by formulas of
the same type. Besides, if using formulas like (1) with matrix $Q$ we pass
from the coordinates $\xi_1, \dots, \xi_{n+1}$ to the coordinates $\xi'_1, \dots, \xi'_{n+1}$
and then, via similar formulas with matrix $\tilde{Q}$, we pass from the
coordinates $\xi'_1, \dots, \xi'_{n+1}$ to the coordinates $\tilde\xi_1, \dots, \tilde\xi_{n+1}$, then
$\tilde\xi_1, \dots, \tilde\xi_{n+1}$ will be expressed in terms of $\xi_1, \dots, \xi_{n+1}$ via formulas
of type (1) with matrix $\tilde{Q}Q$. In short, the equivalence of all the
indicated coordinate systems is a consequence of the fact that
transformations of type (1) constitute a group (the familiar group
of linear transformations of variables).

5. Formulas of type (1) may be viewed from another
standpoint. We may say that the system of coordinates does not change
but that the point $M(\xi_1, \dots, \xi_{n+1})$ itself is transformed into the
point $M'(\xi'_1, \dots, \xi'_{n+1})$. Regarded from this standpoint,
formulas (1) specify a certain one-to-one transformation of projective
space. Any transformation of this type in the space $P_n$ is termed a
projective transformation. All projective transformations of the
space $P_n$ (that is, those that correspond to all possible nonsingular
matrices $Q$) constitute a group. It is called the projective group
of the space $P_n$.

6. It would be a mistake to think that the projective group of $P_n$
is isomorphic to the group of nonsingular $(n+1) \times (n+1)$
matrices. The point is that formulas of type (1) with matrices $Q$ and
$\alpha Q$ (where $\alpha$ is any nonzero scalar) determine one and the same
projective transformation. In particular, when $Q = \alpha E$, $\alpha \neq 0$, we
obtain, independently of $\alpha$, an identity projective transformation
which leaves all points fixed.
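
That proportional matrices give one and the same projective transformation can be seen on a small numerical example. The sketch below is an illustration only; the helper names are hypothetical, and `same_point` tests proportionality of two nonzero coordinate sets by checking that all their 2 × 2 minors vanish.

```python
def apply(Q, xi):
    """Right members of formulas of type (1) for the matrix Q and the set xi."""
    return [sum(Q[i][k] * xi[k] for k in range(len(xi))) for i in range(len(Q))]

def same_point(p, q):
    """True if the nonzero sets p and q are proportional (requirement (B))."""
    m = len(p)
    return all(p[i] * q[j] - p[j] * q[i] == 0
               for i in range(m) for j in range(i + 1, m))

Q = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]        # determinant 3, hence nonsingular
alpha = -3
alpha_Q = [[alpha * q for q in row] for row in Q]
alpha_E = [[alpha if i == j else 0 for j in range(3)] for i in range(3)]
xi = [2, -1, 5]
image = apply(Q, xi)
image_scaled = apply(alpha_Q, xi)
```

The sets `image` and `image_scaled` differ by the factor `alpha` and therefore determine the same point; likewise `alpha_E` leaves the point `xi` fixed.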

7. The subject of projective geometry, that is, the theory of
projective spaces, involves objects, properties, and quantities that are
invariant with respect to the projective group. Let us consider
some invariants of the projective group.

8. We use the term hyperplane in projective space $P_n$ for any
set of points that is defined, in a given coordinate system, by a
homogeneous equation of the first degree:

$$A_1\xi_1 + \dots + A_{n+1}\xi_{n+1} = 0 \tag{2}$$

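We transform equation (2) in accord with the projective
transformation (1) (see Subsection 5). Inverting (1) we get

$$\xi_i = \lambda \sum_k P_{ik} \xi'_k \tag{3}$$

where $P = \|P_{ik}\|$ is the inverse of the matrix $Q$. Substituting this
expression into (2), we obtain

$$A'_1\xi_1 + \dots + A'_{n+1}\xi_{n+1} = 0 \tag{4}$$

where $(\xi_1, \dots, \xi_{n+1})$ denote the running coordinates of a point
in the same coordinate system in which the hyperplane (2) is
specified, and the coefficients $A'_k$ are expressed by the formula

$$A'_k = \sum_i A_i P_{ik} \tag{5}$$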

Since there are nonzero numbers among the $A_i$ and the
matrix $P$ is nonsingular, it follows that there are nonzero numbers
among the $A'_k$. Therefore (4) is not an identity. Consequently,
(4) is a first-degree equation. Conversely, if we substitute (1)
into (4), then, taking into account (5), we get (2). We see that
points that satisfy (2) are carried into points that satisfy (4), and
conversely. For this reason, the images of the points of
hyperplane (2) fill the entire hyperplane (4).

Conclusion. Under a projective transformation, every hyperplane 
is carried into a hyperplane. 

Thus, the set of all hyperplanes in $P_n$ is an entity that is
invariant with respect to the projective group. Therefore hyperplanes
are part of the subject matter of projective geometry.

Remark. The transition from equation (2) to (4) may be viewed
otherwise, namely as a change to the equation of the same
hyperplane in another system of projective coordinates. From this it is
evident that the equation of a hyperplane is linear and
homogeneous in all projective coordinate systems. It is easy to see that,
generally, the projective invariance of a class of entities is
equivalent to the invariance of the class of equations of these entities
with respect to a transition from one projective coordinate system
to another.

9. We use the term $k$-dimensional (projective) plane in $P_n$ for
any set of points that is determined in a given coordinate system
by some homogeneous linear system of equations of rank
$r = n - k$:

$$\begin{aligned} a_{11}\xi_1 + \dots + a_{1,n+1}\xi_{n+1} &= 0 \\ \vdots \qquad & \\ a_{r1}\xi_1 + \dots + a_{r,n+1}\xi_{n+1} &= 0 \end{aligned} \tag{6}$$

Denote the rectangular matrix of system (6) by $A$.
Transforming (6) via the formulas (3), we get

$$\begin{aligned} a'_{11}\xi_1 + \dots + a'_{1,n+1}\xi_{n+1} &= 0 \\ \vdots \qquad & \\ a'_{r1}\xi_1 + \dots + a'_{r,n+1}\xi_{n+1} &= 0 \end{aligned} \tag{7}$$

Let $A'$ be the matrix of system (7). As in Subsection 9, Section 5,
Chapter III, we have, from (3) and (6),

$$A' = AP \tag{8}$$

From (8) it follows that rank $A'$ = rank $A$ and so (7) specifies a
plane of the same dimension $k$.

From these manipulations it is clear that under a projective
transformation a $k$-dimensional plane goes into a $k$-dimensional
plane. Thus, the set of all $k$-dimensional planes in $P_n$ is an entity
which is invariant with respect to the projective group. Therefore,
$k$-dimensional planes come within the subject matter of projective
geometry.


10. Two planes in projective space are said to be skew if they
do not have any points in common.

In $P_n$ let us consider planes $P_k$ and $P_l$ of dimensions $k$ and $l$,
specified by systems of homogeneous linear equations, and let us
combine all their equations into a single system. If the combined
system has only a trivial solution, then $P_k$ and $P_l$ are skew, since
the set $(0, \dots, 0)$ does not determine any point; otherwise the
planes intersect. From this it is easy to compute that two planes
in $P_n$ can be skew only if the sum of their dimensions is less
than the dimension of the space:

$$k + l < n$$

From Subsection 9 and from the one-to-oneness of projective
transformations it follows that under projective transformations
intersecting planes go into intersecting planes and skew planes go
into skew planes.

Remark. When supplementing affine space with ideal points,
it may happen that planes which are skew in $\mathfrak{A}_n$ become
intersecting planes of the supplemented space (concerning skew
planes in $\mathfrak{A}_n$, see Section 7 of Chapter III).

11. We define a quadric hypersurface in the projective space $P_n$
as any set of points determined in a projective coordinate system
by a second-degree homogeneous equation:

$$\sum_{i,j=1}^{n+1} a_{ij} \xi_i \xi_j = 0$$

As in the preceding case, we can prove that the set of all quadric
hypersurfaces in $P_n$ is an entity that is invariant with respect to
the projective group. For this reason, quadric hypersurfaces belong
to the subject of projective geometry.

12. Projective geometry treats of the properties of hyperplanes,
$k$-dimensional planes, quadric hypersurfaces, etc., that are
invariant under any projective transformations. The dimension of a
plane is one such property.

13. A one-dimensional projective plane is called a projective
line.

Let us consider an arbitrary straight line in $P_n$. It is determined
by a homogeneous linear system of equations of rank $n - 1$ in
$n + 1$ variables. Hence, in the given case the fundamental set of
solutions consists of two independent solutions. We denote them
by $(u_1, \dots, u_{n+1})$ and $(v_1, \dots, v_{n+1})$. They are associated with
two points $U$, $V$ on the line. Let $M(\xi_1, \dots, \xi_{n+1})$ be an arbitrary
point of this line. Since every solution is linearly expressible in
terms of the fundamental solutions, we have

$$\xi_i = \mu u_i + \nu v_i, \quad i = 1, \dots, n+1 \tag{9}$$

where $\mu$, $\nu$ are certain numbers not simultaneously zero.

Formula (9) expresses the geometric fact that a straight line is
uniquely determined by any two of its points and that a straight
line can be drawn through any two points $U, V \in P_n$.

The numbers $\mu$, $\nu$ may be viewed as the homogeneous
coordinates of point $M$ on the given line. At the same time we conclude
that a straight line in projective space is itself a one-dimensional
projective space.

Confining ourselves to the case of a real space $P_n$, we put

$$c = \frac{1}{\sqrt{\mu^2 + \nu^2}}, \quad \mu c = \cos\alpha, \quad \nu c = \sin\alpha$$

Then from (9) we have

$$\xi_i^* = u_i \cos\alpha + v_i \sin\alpha \tag{10}$$

where $\xi_i^* = c\,\xi_i$ are the coordinates of the same point $M$. Varying $\alpha$
from $-\infty$ to $+\infty$, we get all possible points on the given straight
line. Here, due to the periodicity of sine and cosine, every point $M$
will be repeated an infinitude of times. Formula (10) shows that a
simple closed curve, say, that of an ordinary circle can serve as
a pictorial model for a real projective line. Actually, when $\alpha$ varies
from 0 to $\pi$, $M$ runs through the entire projective line once and
returns to its original position. Observe that when an affine space
is supplemented with ideal points, every line is supplemented with
a single point, which is what makes it a closed curve (Fig. 92).
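
The closing-up of the projective line after a half-turn of the parameter can be checked numerically. The following sketch assumes real coordinates and floating-point arithmetic; the function name and the sample points are illustrative.

```python
import math

def point_on_line(u, v, alpha):
    """Coordinates given by formula (10) for the parameter value alpha."""
    return [ui * math.cos(alpha) + vi * math.sin(alpha) for ui, vi in zip(u, v)]

u = [1.0, 0.0, 2.0]
v = [0.0, 1.0, 1.0]
p = point_on_line(u, v, 0.7)
q = point_on_line(u, v, 0.7 + math.pi)
```

The coordinates `q` are the coordinates `p` multiplied by −1 (up to rounding), so the parameter values $\alpha$ and $\alpha + \pi$ give one and the same point of the projective line.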

Fig. 92


A similar representation of a real two-dimensional projective 
plane in the form of a sphere is erroneous. In this connection, see 
Subsection 11 of the next section. 

14. In the case of a $k$-dimensional projective plane, in place of
(9) we have

$$\xi_i = \mu_1 u_i^{(1)} + \mu_2 u_i^{(2)} + \dots + \mu_{k+1} u_i^{(k+1)} \tag{11}$$

where $\{u_i^{(1)}\}, \dots, \{u_i^{(k+1)}\}$ constitute the fundamental set of
solutions of a linear system of equations of type (6), and $\mu_1, \dots,
\mu_{k+1}$ are numbers, some of which are nonzero. They may be
viewed as the homogeneous coordinates of a point in a
$k$-dimensional plane.

From this it is clear that a $k$-dimensional plane in $P_n$ is itself
a $k$-dimensional projective space (real or complex according as
$P_n$ is real or complex).

15. Now let us return to the arbitrary line (9) in projective
space $P_n$. Together with points $U$, $V$ we consider another two
points on this same straight line: $M$ with coordinates

$$\xi_i = \mu u_i + \nu v_i \tag{12}$$

and $N$ with coordinates

$$\eta_i = \bar\mu u_i + \bar\nu v_i \tag{13}$$

We assume they differ from $U$ and $V$. The number

$$\frac{\nu}{\mu} : \frac{\bar\nu}{\bar\mu}$$

is called the cross ratio (or anharmonic ratio) in which the
ordered pair of points $M$, $N$ divides the ordered pair of points $U$, $V$.

To denote this number we use the symbol $(UVMN)$. Thus

$$(UVMN) = \frac{\nu}{\mu} : \frac{\bar\nu}{\bar\mu} \tag{14}$$

Observe that each of the simple ratios $\nu/\mu$ and $\bar\nu/\bar\mu$ is not
determined by the points $U$, $V$, $M$ and $U$, $V$, $N$. This is clear because
the homogeneous coordinates of any one of these points may be
multiplied by any nonzero number.

However, the cross ratio has a very definite numerical value,
which depends solely on the specification, on the line, of the
ordered pairs of points $U$, $V$ and $M$, $N$.

Indeed, if the coordinates of the points $U$, $V$, $M$, $N$ are
multiplied respectively by four arbitrary nonzero factors, then the fractions
$\nu/\mu$ and $\bar\nu/\bar\mu$ will be multiplied by one and the same factor, which
is cancelled out in (14).

Fig. 93

What is more, a cross ratio remains unaltered when passing to
any new system of projective coordinates of the space. This is
equivalent to the fact that a cross ratio is invariant under any
projective transformations. The proof of this is given in Subsection 16.

Let us consider in more detail the geometric meaning of the
cross ratio. Take four distinct points $U$, $V$, $M$, $N$ on an ordinary
straight line. Assume that on this line an affine coordinate $x$ has
been introduced that takes on the values $x_1$, $x_2$, $x_3$, $x_4$ at the
respective points (Fig. 93). Passing to homogeneous coordinates
$(\xi, \eta)$ via the formula $x = \xi/\eta$, $\eta \neq 0$, and setting $\eta = 1$, we
have the following homogeneous coordinates of the points in
question:

$$U(x_1, 1), \quad V(x_2, 1), \quad M(x_3, 1), \quad N(x_4, 1)$$

Then (12) and (13) become

$$\left.\begin{aligned} x_3 &= \mu x_1 + \nu x_2 \\ 1 &= \mu + \nu \end{aligned}\right\} \qquad \left.\begin{aligned} x_4 &= \bar\mu x_1 + \bar\nu x_2 \\ 1 &= \bar\mu + \bar\nu \end{aligned}\right\} \tag{15}$$

From the systems (15) we find $\mu$, $\nu$, $\bar\mu$, $\bar\nu$ and, substituting them
into (14), we get

$$(UVMN) = \frac{x_3 - x_1}{x_2 - x_3} : \frac{x_4 - x_1}{x_2 - x_4} \tag{16}$$
From elementary analytic geometry we know that the fractions in
the right member of (16) are the ratios $\lambda$ and $\bar\lambda$ in which the points
$M$ and $N$ respectively divide the segment $UV$:

$$\lambda = \frac{UM}{MV} = \frac{x_3 - x_1}{x_2 - x_3}, \quad \bar\lambda = \frac{UN}{NV} = \frac{x_4 - x_1}{x_2 - x_4}$$

Thus

$$(UVMN) = \lambda : \bar\lambda = \frac{UM}{MV} : \frac{UN}{NV} \tag{17}$$

for any four distinct points on the affine line.

16. The cross ratio of two pairs of points is an invariant quantity 
with respect to the projective group. 

To prove this, consider an arbitrary projective transformation.
By formulas (3) we have

$$\xi_i = \lambda \sum_k P_{ik} \xi'_k \tag{18}$$

Since the matrix $\|P_{ik}\|$ is nonsingular, from (12) and (18) we
get

$$\xi'_i = \mu u'_i + \nu v'_i \tag{19}$$

where $u'_i$, $v'_i$, $\xi'_i$ are the coordinates of the points $U'$, $V'$, and $M'$,
which are images of the points $U$, $V$ and $M$, respectively. In similar
fashion we obtain the coordinates of the image $N'$ of $N$:

$$\eta'_i = \bar\mu u'_i + \bar\nu v'_i \tag{20}$$

From (19) and (20) we get

$$(U'V'M'N') = \frac{\nu}{\mu} : \frac{\bar\nu}{\bar\mu} = (UVMN)$$

which completes the proof.
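
The invariance just proved can be observed on a concrete example. The sketch below computes the cross ratio from affine coordinates by formula (16) and compares it before and after a projective transformation of the line, written in the affine coordinate as a fractional linear map. The function names and the sample matrix are assumptions of the illustration, and it is assumed that none of the four points is carried to infinity.

```python
from fractions import Fraction as Fr

def cross_ratio(x1, x2, x3, x4):
    """(UVMN) by formula (16), for the affine coordinates of U, V, M, N."""
    return ((x3 - x1) / (x2 - x3)) / ((x4 - x1) / (x2 - x4))

def projective_map(q, x):
    """Affine coordinate of the image of x under the 2x2 nonsingular matrix q.

    In homogeneous coordinates (xi, eta) with x = xi/eta the map is linear;
    the denominator must be nonzero for the image to be an ordinary point.
    """
    (q11, q12), (q21, q22) = q
    return (q11 * x + q12) / (q21 * x + q22)

q = ((Fr(2), Fr(1)), (Fr(1), Fr(3)))     # determinant 5, hence nonsingular
pts = [Fr(0), Fr(3), Fr(1), Fr(2)]       # coordinates of U, V, M, N
before = cross_ratio(*pts)
after = cross_ratio(*(projective_map(q, x) for x in pts))
```

In this example both `before` and `after` equal 1/4, although the simple ratios in which $M$ and $N$ divide the segment $UV$ are changed by the transformation.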

17. At the conclusion of the preceding section we remarked that
an affine space $\mathfrak{A}_n$ supplemented with ideal elements is a model
of the general concept of an $n$-dimensional projective space. It
must be stressed that if we consider a supplemented affine space
as a projective space and admit any transformations of type (1),
Section 2, then we must not treat ideal elements preferentially and
must regard them as being on a par with ordinary elements, since
we can carry any such ideal point into an ordinary point via a
projective transformation. Also note that the definition of a
projective space does not isolate any elements as being at infinity.
Therefore the set of ideal points in the supplemented affine space
is not an invariant entity with respect to the projective group. For
this reason, the concept of points at infinity (or ideal points) does
not come within the sphere of projective geometry.

18. By means of a transformation of coordinates in $P_n$ we can
make any preassigned hyperplane have the equation $\xi_{n+1} = 0$ and
we can agree to consider it to be infinitely distant. The indication
of precisely which hyperplane is taken for the ideal hyperplane
can be regarded as a return from projective space to affine space.

19. If homogeneous coordinates are introduced in affine space,
then, as can readily be verified, any affine transformation is given
by formulas of type (4), Section 1, with a nonsingular matrix of
coefficients. A comparison of (4), Section 1, with (1) of
this section will make it clear that the affine transformations of
the space $\mathfrak{A}_n$ may be taken as a special case of projective
transformations in a supplemented affine space, that is, in $P_n$. Namely,
we can regard as affine all transformations of type (1) that
preserve ideal points as ideal points.

Indeed, if from the condition $\xi_{n+1} = 0$ we necessarily obtain
$\xi'_{n+1} = 0$, then the projective transformation must have the
form (3), Section 1. If we are only interested in the points of the
space $\mathfrak{A}_n$ itself, then $\xi_{n+1} \neq 0$ and from (3), Section 1, we get
formulas (2), Section 1, that express an affine transformation.

Important corollary. The group of all affine transformations in
$\mathfrak{A}_n$ is a subgroup of the projective group of the space $P_n$.

Remark. For this formulation it is very important that we agree
to regard as affine certain special projective transformations. We
have thus included affine transformations in the category of
projective transformations.

Due to the fact that the range of projective transformations is
richer than that of affine transformations, the projective group has
fewer invariants than the affine group; every invariant of the
projective group is an invariant of the affine group, but the converse
is not true. For example, each of the ratios (17) is an affine
invariant but is not a projective invariant (the latter follows without
computations from Subsection 3, Section 5).

On the other hand, in contrast to affine geometry (and all the
more so to metric geometry) projective geometry deals with the
properties of geometric figures that are more stable in the sense that
they are preserved under all transformations of a more extensive
group.

20. The material of Subsections 8 to 19 serves to illustrate the 
brief statement contained in Subsection 7.

§ 3. A bundle of planes in affine space

1. Consider in (n + 1)-dimensional affine space $\mathfrak{A}_{n+1}$ the set of
all planes of all dimensions (this includes straight lines) passing
through a fixed point O. This set is called a bundle (sheaf) of
planes with centre O. From now on we will denote both the bundle
and the centre by O.

Take O for the origin of an affine system of coordinates with an
arbitrary basis $e_1, \dots, e_{n+1}$. We leave the origin unchanged, and
so we will identify $\mathfrak{A}_{n+1}$ and the corresponding linear space $L_{n+1}$.

Every straight line of the bundle is uniquely determined by
specification of a point M other than O. Let $\xi_1, \dots, \xi_{n+1}$ be the
coordinates of M in the basis $e_1, \dots, e_{n+1}$. Then for an arbitrary
$\lambda \neq 0$ the point $(\lambda\xi_1, \dots, \lambda\xi_{n+1})$ determines the same straight
line OM.

A straight line regarded as an element of the bundle will be
called a point; the numbers $\xi_1, \dots, \xi_{n+1}$ are its homogeneous
coordinates. Then it is clear that the set of such points (that is, of
straight lines of the bundle O) constitutes an n-dimensional
projective space $P_n$. It is also clear that every (k + 1)-dimensional
plane of the bundle O is a k-dimensional projective plane in $P_n$,
since it passes through the origin and, consequently, is determined
by a system of homogeneous linear equations of rank
$r = (n + 1) - (k + 1) = n - k$.
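The statement that proportional coordinate sets determine the same straight line of the bundle can be checked mechanically. The following is a minimal sketch; the function name `same_projective_point` is our own illustrative choice and does not come from the text.

```python
def same_projective_point(x, y):
    """Return True if the nonzero vectors x and y are proportional,
    i.e. if they direct the same straight line of the bundle O."""
    if len(x) != len(y) or not any(x) or not any(y):
        raise ValueError("expected two nonzero vectors of equal length")
    n = len(x)
    # x and y are proportional exactly when every 2x2 minor vanishes.
    return all(x[i] * y[j] - x[j] * y[i] == 0
               for i in range(n) for j in range(i + 1, n))


M = [1, 2, 3]                   # coordinates of a point M other than O
scaled = [-2 * c for c in M]    # the point (lambda*xi_1, ...), lambda = -2
print(same_projective_point(M, scaled))     # same straight line OM
print(same_projective_point(M, [1, 2, 4]))  # a different straight line
```

Testing the 2x2 minors avoids any division, so the check works equally well when some coordinates are zero.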

We have thus obtained another geometric model of a projective
space $P_n$, this time in the form of a bundle in an affine space of
dimension n + 1. Unlike Section 1, here we do not require any
adjoining of new points. All elements of the set under
consideration (straight lines of a bundle) are geometrically equivalent.

2. We now consider jointly both of our models of an
n-dimensional projective space. This will help to illuminate the geometric
meaning of transformations of coordinates and projective
transformations in $P_n$, which were defined algebraically in Section 2.

For an n-dimensional affine space $\mathfrak{A}_n$ let us take in the space
$\mathfrak{A}_{n+1}$ a hyperplane that does not pass through the centre of the
bundle O. To $\mathfrak{A}_{n+1}$ we adjoin ideal elements in accord with
Subsections 5 to 8, Section 1. Then $\mathfrak{A}_{n+1}$ will turn into the projective
space $P_{n+1}$ and the hyperplane $\mathfrak{A}_n$ will become a projective space
of dimension n, which we denote $S_n$.

Every straight line a of bundle O (the line is supplemented with
an ideal point) intersects $S_n$ in some point A (Fig. 94). We will
say that the point A corresponds to the line a. This correspondence
is one-to-one because of the adjunction of ideal elements to $\mathfrak{A}_n$;
namely, if a straight line in affine space is parallel to the
hyperplane $\mathfrak{A}_n$, then A is an ideal point (Fig. 95).

Let OM be a direction vector of line a,
$OM = \xi_1 e_1 + \dots + \xi_{n+1} e_{n+1}$. The numbers
$\lambda\xi_1, \dots, \lambda\xi_{n+1}$, $\lambda \neq 0$, which are proportional
to the coordinates of the vector OM, will be taken for the
homogeneous coordinates of the point A in $S_n$ which corresponds to
line a (Fig. 96). They coincide with the homogeneous coordinates
of this line that are defined in accordance with Subsection 1.

It is clear that the choice of basis $e_1, \dots, e_{n+1}$ in $\mathfrak{A}_{n+1}$ fully
determines the projective coordinates in the bundle O and in the
hyperplane $S_n$. This system of projective coordinates is preserved
under a similarity transformation of the basis $e_i$ (i.e. when changing
to a basis of the form $\alpha e_1, \dots, \alpha e_{n+1}$, $\alpha \neq 0$).

Note the particular case where the vectors $e_1, \dots, e_n$ are parallel
to $\mathfrak{A}_n$. In this case, we denote by O' the point of intersection of the
hyperplane $\mathfrak{A}_n$ with the line of bundle O whose direction vector is
$e_{n+1}$. In $\mathfrak{A}_n$ we introduce affine coordinates with origin O' and
basis $e_1, \dots, e_n$ (Fig. 97). Then the affine coordinates $x_1, \dots, x_n$
of an arbitrary point $A \in \mathfrak{A}_n$ and its homogeneous coordinates
$\xi_1, \dots, \xi_{n+1}$ will be related by the formulas (1) of Section 1, that
is, $x_i = \xi_i / \xi_{n+1}$, $i = 1, \dots, n$, $\xi_{n+1} \neq 0$.
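The passage between affine and homogeneous coordinates described above can be sketched as follows. The helper names `to_homogeneous` and `to_affine` are hypothetical, and returning `None` for an ideal point is our own convention, not one taken from the text.

```python
from fractions import Fraction


def to_homogeneous(x):
    """Affine coordinates x_1..x_n -> homogeneous coordinates
    (xi_1, ..., xi_n, xi_{n+1}) with the choice xi_{n+1} = 1."""
    return [Fraction(c) for c in x] + [Fraction(1)]


def to_affine(xi):
    """Homogeneous -> affine by x_i = xi_i / xi_{n+1}.
    Returns None when xi_{n+1} = 0, i.e. for an ideal point."""
    last = Fraction(xi[-1])
    if last == 0:
        return None
    return [Fraction(c) / last for c in xi[:-1]]


print(to_affine([2, 4, 2]))     # ordinary point of the hyperplane
print(to_affine([-3, -6, -3]))  # proportional coordinates, same point
print(to_affine([1, 5, 0]))     # ideal point: no affine coordinates
```

Exact rational arithmetic is used so that proportional coordinate sets give literally equal affine coordinates.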





3. Now let us return to the general case. Let $e'_1, \dots, e'_{n+1}$ be a
new basis in $\mathfrak{A}_{n+1}$. Then

$$e_i = \sum_{j=1}^{n+1} q_{ji} e'_j \qquad (1)$$

If OM is an arbitrary nonzero vector on the straight line a having
in the old basis the coordinates $\xi_1, \dots, \xi_{n+1}$ and in the new basis
the coordinates $\xi'_1, \dots, \xi'_{n+1}$, then

$$\xi'_j = \sum_{i=1}^{n+1} q_{ji} \xi_i, \quad j = 1, \dots, n+1 \qquad (2)$$

But the numbers $\xi_1, \dots, \xi_{n+1}$ are the old homogeneous coordinates
of A and the numbers $\xi'_1, \dots, \xi'_{n+1}$ are its new homogeneous
coordinates in $S_n$. For this reason the formulas (2) may be viewed as
a transformation of the homogeneous coordinates of points in a
supplemented hyperplane $S_n$ (and also, at the same time, in the
bundle O) under the just described change in the coordinate
system of the space $\mathfrak{A}_{n+1}$. In place of (2) we can write (1), Section 2,
since homogeneous coordinates are defined up to a proportionality
factor.

4. Now suppose a transformation of projective coordinates is
specified via (1) of Section 2 in an n-dimensional projective
space $P_n$. Identifying $P_n$ with the bundle O in $\mathfrak{A}_{n+1}$ and setting
$\lambda = 1$ in (1), Section 2, we obtain formulas (2). They may be
viewed as formulas of a transformation of coordinates in $\mathfrak{A}_{n+1}$
under which the origin remains fixed and the basis is transformed
by formulas (1).

5. To summarize, then, we can say that the projective
coordinates in $P_n$ are determined by specification of the basis
$e_1, \dots, e_{n+1}$ in $\mathfrak{A}_{n+1}$, and formulas (1) of Section 2 express the
change to a new projective system of coordinates in $P_n$, which system
is given by the new basis $e'_1, \dots, e'_{n+1}$ in $\mathfrak{A}_{n+1}$. It is immaterial
here which of the two models of $P_n$ is considered: the bundle O or
the hyperplane $S_n$.

6. Formulas (1), Section 2, can be similarly interpreted if we
assume that they specify a projective transformation in $P_n$. To do
this we have to consider in $\mathfrak{A}_{n+1}$ an affine transformation that
leaves the point O fixed and is given by (2). Then an arbitrary
point $M \in \mathfrak{A}_{n+1}$ with coordinates $\xi_1, \dots, \xi_{n+1}$ goes into the point
$M'(\xi'_1, \dots, \xi'_{n+1})$. If we are not interested in the point M' itself,
but only in the line OM' into which OM goes, then in (2) we can
multiply the numbers $\xi_1, \dots, \xi_{n+1}$ or the numbers
$\xi'_1, \dots, \xi'_{n+1}$ by any number $\lambda \neq 0$. If we put the factor
$\lambda$ in the left members of formulas (2), then we get formulas (1)
of Section 2.

Thus we can say that any projective transformation in the
augmented hyperplane $S_n$ is induced by an affine transformation
in $\mathfrak{A}_{n+1}$ which leaves fixed the point O (or, what is the same thing,
by a nonsingular linear transformation in $L_{n+1}$).

If we take the bundle O for a model of the projective space $P_n$,
then we can say that the projective transformations in $P_n$ are
merely affine transformations in $\mathfrak{A}_{n+1}$ that preserve the point O,
bearing in mind that points in $P_n$ are the straight lines of the
bundle O.

7. To get a complete picture of what has been said about the
model of a projective space in the form of a bundle O in $\mathfrak{A}_{n+1}$ and
about projective transformations on this model we advise the
reader to picture himself observing the space $\mathfrak{A}_{n+1}$ from the centre
of the bundle O. Then all points of the space lying on a single ray
of vision will appear to be a single point. Then if the reader,
located at O, observes the movement of points under a nonsingular
linear transformation in $L_{n+1} = \mathfrak{A}_{n+1}$, he will actually see a
projective transformation in the bundle or in any hyperplane not
passing through point O.

Note that distinct linear transformations $Ax$ and $\mu Ax$ ($\mu$ any
nonzero scalar) in $\mathfrak{A}_{n+1}$ induce one and the same projective
transformation in the bundle O.
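The last remark is easy to verify numerically. The sketch below uses a matrix `A` and a scalar `mu` chosen by us purely for illustration: it applies $A$ and $\mu A$ to the same direction vector and checks that the images are proportional, that is, that they direct the same line of the bundle.

```python
def mat_vec(A, x):
    """Apply the linear transformation with matrix A (list of rows) to x."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]


def proportional(x, y):
    """True if nonzero vectors x and y direct the same line through O."""
    n = len(x)
    return (any(x) and any(y) and
            all(x[i] * y[j] - x[j] * y[i] == 0
                for i in range(n) for j in range(i + 1, n)))


A = [[1, 2, 0],
     [0, 1, 3],
     [1, 0, 1]]                      # nonsingular: its determinant is 7
mu = 5
muA = [[mu * a for a in row] for row in A]

x = [2, -1, 4]                       # direction vector of a line of the bundle
print(mat_vec(A, x), mat_vec(muA, x))
print(proportional(mat_vec(A, x), mat_vec(muA, x)))
```

The two image vectors differ by the factor `mu`, so as points of $P_n$ they coincide.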

8. As an application of the foregoing constructions, let us look
into the conditions that uniquely determine a projective
transformation in $P_n$. We give the following definition which will be
useful in the sequel.

Definition. A set of r + 1 points in $P_n$ is in the general position
if the points do not belong to a single (r - 1)-dimensional
(projective) plane.

It is obvious that the total number of points in such a system
cannot exceed n + 1.

Remark. If for $P_n$ we consider a bundle O in $\mathfrak{A}_{n+1}$, then the
points $a_1, \dots, a_m \in P_n$ are in the general position in the space $P_n$
if and only if the direction vectors of the straight lines
$a_1, \dots, a_m$ in $\mathfrak{A}_{n+1}$ are linearly independent in $\mathfrak{A}_{n+1}$. This is
evident if we recall that, in such a model, k-dimensional (projective)
planes of $P_n$ are (k + 1)-dimensional planes of the bundle O.
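By the Remark, testing for general position reduces to a rank computation on direction vectors. A minimal sketch, assuming the points are given by integer or rational direction vectors; `rank` and `in_general_position` are illustrative names of our own.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows (exact Gaussian elimination)."""
    m = [[Fraction(c) for c in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_general_position(points):
    """Points of P_n (as direction vectors in A_{n+1}) are in the general
    position iff the vectors are linearly independent."""
    return rank(points) == len(points)


print(in_general_position([[1, 0, 0], [0, 1, 0], [1, 1, 1]]))  # a triangle
print(in_general_position([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))  # collinear
```

Since the rank of vectors in $\mathfrak{A}_{n+1}$ never exceeds n + 1, the function also reflects the observation that more than n + 1 points cannot be in the general position.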

In $P_n$ let there be arbitrarily given a set of n + 2 points
$a_1, \dots, a_{n+1}, a_{n+2}$ such that any n + 1 of the points are in the
general position. Again in $P_n$ let there be arbitrarily given a similar
set of points $a'_1, \dots, a'_{n+1}, a'_{n+2}$. We will prove that the following
theorem holds true.

Theorem 1. There exists a unique projective transformation of
the space $P_n$ that carries $a_i$ into $a'_i$ for all $i = 1, \dots, n + 2$.

Proof. For a model of the projective space $P_n$ we again take a
bundle O in $\mathfrak{A}_{n+1}$. In O let the straight lines $a_1, \dots, a_{n+1}, a_{n+2}$ be
given as inverse-image points, and the straight lines
$a'_1, \dots, a'_{n+2}$ as image points. Take a (nonzero) direction vector
$e_{n+2}$ of the straight line $a_{n+2}$ and any (nonzero) direction vectors
$\tilde{e}_i$ of lines $a_i$ ($i = 1, \dots, n + 1$). By hypothesis, the lines
$a_1, \dots, a_{n+1}$, being points of $P_n$, are in the general position. Hence,
the vectors $\tilde{e}_i$ ($i = 1, \dots, n + 1$) constitute a basis in $\mathfrak{A}_{n+1}$.
Therefore we have the expansion

$$e_{n+2} = \lambda_1 \tilde{e}_1 + \dots + \lambda_{n+1} \tilde{e}_{n+1}$$

We now prove that not a single one of the numbers $\lambda_i$ is zero:

$$\lambda_i \neq 0, \quad i = 1, \dots, n + 1 \qquad (3)$$

Let us assume the contrary. For example, let $\lambda_1 = 0$. Then $e_{n+2}$ is
linearly expressible in terms of the vectors $\tilde{e}_2, \dots, \tilde{e}_{n+1}$ and so the
system $\tilde{e}_2, \dots, \tilde{e}_{n+1}, e_{n+2}$ is linearly dependent, which contradicts
the hypothesis, since the points $a_2, \dots, a_{n+1}, a_{n+2} \in P_n$ are in the
general position. Thus $\lambda_1 \neq 0$. The remaining inequalities of (3)
are established in a similar manner.

Set $e_i = \lambda_i \tilde{e}_i$, $i = 1, \dots, n + 1$. Then

$$e_{n+2} = e_1 + \dots + e_{n+1} \qquad (4)$$

From the independence of the vectors $\tilde{e}_1, \dots, \tilde{e}_{n+1}$ and the
inequalities (3) it follows that the vectors $e_1, \dots, e_{n+1}$ are also
independent. In similar fashion it can be proved that there are
vectors $e'_i$ lying respectively on the straight lines $a'_i$ such that

$$e'_{n+2} = e'_1 + \dots + e'_{n+1} \qquad (5)$$

and $e'_1, \dots, e'_{n+1}$ are linearly independent in $\mathfrak{A}_{n+1}$, whence and
also from Subsection 6, Section 3, Chapter VII, it follows that in
$\mathfrak{A}_{n+1} = L_{n+1}$ there is a nonsingular linear transformation $x' = Ax$
for which $e'_i = Ae_i$, $i = 1, \dots, n + 1$. Using (5), we find that

$$e'_{n+2} = e'_1 + \dots + e'_{n+1} = Ae_1 + \dots + Ae_{n+1} = A(e_1 + \dots + e_{n+1}) = Ae_{n+2}$$

Thus $e'_i = Ae_i$ for all $i = 1, \dots, n + 1, n + 2$; that is, the linear
transformation $x' = Ax$ in $\mathfrak{A}_{n+1}$ carries the given straight lines
$a_1, \dots, a_{n+2}$ into the given straight lines $a'_1, \dots, a'_{n+2}$ and, hence,
induces the desired projective transformation, which we denote
by f, in the bundle O.

We prove uniqueness. Let there be another projective
transformation $\varphi$ in the bundle O for which $\varphi(a_i) = a'_i$, $i = 1, \dots, n + 2$.
By Subsection 6 it is induced by a nonsingular linear
transformation B of the space $\mathfrak{A}_{n+1}$, which transformation also carries
the lines $a_i$ into $a'_i$ ($i = 1, \dots, n + 2$). Then

$$Be_i = \mu_i e'_i \qquad (6)$$

where $\mu_i$ are certain numbers; $i = 1, \dots, n + 2$. Here, due to (4)
and (6), we have

$$Be_{n+2} = Be_1 + \dots + Be_{n+1} = \mu_1 e'_1 + \dots + \mu_{n+1} e'_{n+1} \qquad (7)$$

On the other hand, using (5) and (6), we find that

$$Be_{n+2} = \mu_{n+2} e'_{n+2} = \mu_{n+2} e'_1 + \dots + \mu_{n+2} e'_{n+1} \qquad (8)$$

From (7) and (8) follows

$$\mu_{n+2} = \mu_1 = \mu_2 = \dots = \mu_{n+1} \qquad (9)$$

since the vectors $e'_1, \dots, e'_{n+1}$ are independent. Putting $\mu_{n+2} = \mu$,
we get

$$Be_i = \mu e'_i = \mu Ae_i, \quad i = 1, \dots, n + 1$$

But then $Bx = \mu Ax$ for any vector x in $\mathfrak{A}_{n+1}$, and this means that
$\varphi = f$, which completes the proof of Theorem 1.
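The existence part of the proof is constructive and can be followed step by step in code. The sketch below is only an illustration under our own naming (`solve`, `adapted_basis`, `projective_map` are hypothetical helpers): it rescales direction vectors so that (4) and (5) hold and then builds the matrix A with $Ae_i = e'_i$. Points are assumed to be given by rational direction vectors; a configuration that violates the general-position hypothesis raises an error.

```python
from fractions import Fraction


def solve(columns, b):
    """Solve sum_j x_j * columns[j] = b exactly (Gauss-Jordan elimination).
    Fails with StopIteration if the column vectors are dependent."""
    n = len(columns)
    aug = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(b[i])]
           for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [c / p for c in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * c for a, c in zip(aug[i], aug[col])]
    return [row[n] for row in aug]


def adapted_basis(points):
    """From direction vectors of a_1..a_{n+2} build e_1..e_{n+1} on the first
    n+1 lines with e_1 + ... + e_{n+1} = (last vector), as in formula (4)."""
    *base, last = points
    lam = solve(base, last)
    if any(l == 0 for l in lam):        # inequalities (3) must hold
        raise ValueError("points are not in the general position")
    return [[l * c for c in v] for l, v in zip(lam, base)]


def projective_map(src, dst):
    """Matrix A (list of rows) of a linear map with A e_i = e'_i."""
    e, e2 = adapted_basis(src), adapted_basis(dst)
    n = len(e)
    cols = []
    for k in range(n):
        unit = [1 if i == k else 0 for i in range(n)]
        c = solve(e, unit)              # k-th unit vector = sum_j c_j e_j
        cols.append([sum(c[j] * e2[j][i] for j in range(n)) for i in range(n)])
    return [[cols[k][i] for k in range(n)] for i in range(n)]


def apply(A, x):
    return [sum(a * c for a, c in zip(row, x)) for row in A]


def proportional(x, y):
    n = len(x)
    return (any(x) and any(y) and
            all(x[i] * y[j] - x[j] * y[i] == 0
                for i in range(n) for j in range(i + 1, n)))


src = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
dst = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 2, 4]]
A = projective_map(src, dst)
print(all(proportional(apply(A, s), d) for s, d in zip(src, dst)))
```

Multiplying A by any nonzero scalar gives another matrix with the same property, in agreement with the uniqueness argument: the linear map is determined only up to the factor $\mu$.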

9. We have already pointed out that the coordinates in $P_n$ are
determined by a specification of the basis in $\mathfrak{A}_{n+1}$. But in $\mathfrak{A}_{n+1}$ the
basis is an entity that is exterior to the space $P_n$. It is desirable
to have a different way of specifying projective coordinates that is
based solely on a consideration of entities of the projective space
itself.

Suppose in $P_n$ we have an arbitrarily chosen and fixed set of
n + 2 points

$$A_1, A_2, \dots, A_{n+1}, B \qquad (10)$$

such that any n + 1 points are in the general position.

Theorem 2. In an n-dimensional projective space $P_n$ there is a
unique set of projective coordinates in which the points (10) have
the following homogeneous coordinates:

$$\begin{aligned}
&A_1\,(1, 0, \dots, 0),\\
&A_2\,(0, 1, \dots, 0),\\
&\dots\\
&A_{n+1}\,(0, 0, \dots, 1),\\
&B\,(1, 1, \dots, 1)
\end{aligned} \qquad (11)$$

Proof. For a model of $P_n$ let us consider the bundle O in $\mathfrak{A}_{n+1}$.
As in the proof of Theorem 1, we find the direction vectors
$e_1, \dots, e_{n+1}, e_{n+2}$ of the straight lines $A_1, \dots, A_{n+1}, B$ of bundle O
such that

(a) $e_{n+2} = e_1 + \dots + e_{n+1}$,

(b) $e_1, \dots, e_{n+1}$ are linearly independent in $\mathfrak{A}_{n+1}$.

We take the vectors $e_1, \dots, e_{n+1}$ for a basis in $\mathfrak{A}_{n+1}$. Then, by
Subsection 1, the homogeneous coordinates of the points will be
determined in $P_n$ and conditions (11) will be fulfilled (the last one
due to (a)).

We now prove uniqueness. Let a projective coordinate system in
$P_n$ be specified by a different basis $\tilde{e}_1, \dots, \tilde{e}_{n+1}$ of the space $\mathfrak{A}_{n+1}$
and let conditions (11) hold again. Then, by virtue of (11), the
vectors $\tilde{e}_1, \dots, \tilde{e}_{n+1}$ are necessarily direction vectors for the
straight lines $A_1, \dots, A_{n+1}$ of bundle O, that is,
$\tilde{e}_i = \alpha_i e_i$ ($\alpha_i \neq 0$, $i = 1, \dots, n + 1$). Besides,
$\sum_{i=1}^{n+1} \tilde{e}_i = \alpha_{n+2} e_{n+2}$, $\alpha_{n+2} \neq 0$, since the
vector $\tilde{e}_1 + \dots + \tilde{e}_{n+1}$ must be a direction vector for the straight
line $B \in O$. Then, as in (7) to (9), it is established that
$\alpha_1 = \dots = \alpha_{n+1}$, whence it follows that the homogeneous coordinates
$\tilde{\xi}_1, \dots, \tilde{\xi}_{n+1}$ of a point X that are specified via the basis
$\tilde{e}_1, \dots, \tilde{e}_{n+1}$ are proportional to its coordinates $\xi_1, \dots, \xi_{n+1}$
derived from the basis $e_1, \dots, e_{n+1}$. The proof of Theorem 2 is complete.
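Theorem 2 can be illustrated by computing coordinates relative to a given frame (10). The sketch below is a hedged illustration with helper names of our own (`solve`, `frame_coordinates`); it assumes the frame points are given by rational direction vectors in some auxiliary basis of $\mathfrak{A}_{n+1}$ and that they satisfy the general-position hypothesis.

```python
from fractions import Fraction


def solve(columns, b):
    """Solve sum_j x_j * columns[j] = b exactly (Gauss-Jordan elimination)."""
    n = len(columns)
    aug = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(b[i])]
           for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [c / p for c in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * c for a, c in zip(aug[i], aug[col])]
    return [row[n] for row in aug]


def frame_coordinates(frame, x):
    """Homogeneous coordinates of the point x relative to the frame
    A_1, ..., A_{n+1}, B (all given by direction vectors)."""
    *vertices, unit = frame
    lam = solve(vertices, unit)
    # Basis vectors e_i on the lines A_i with e_1 + ... + e_{n+1} on B.
    basis = [[l * c for c in v] for l, v in zip(lam, vertices)]
    return solve(basis, x)


frame = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 2, 4]]
print(frame_coordinates(frame, [1, 2, 4]))   # the unit point B
print(frame_coordinates(frame, [1, 1, 0]))   # the vertex A_1
print(frame_coordinates(frame, [2, 3, 5]))   # some other point X
```

Replacing the direction vectors of the frame points by proportional ones changes the resulting coordinates only by a common factor, which is the content of the uniqueness statement.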

10. Remark. To illustrate the geometric meaning of Theorem 2
let us consider, as a model of $P_n$, the augmented affine space $\mathfrak{A}_n$,
assuming that the homogeneous coordinates $\xi_1, \dots, \xi_{n+1}$ are
introduced, in accordance with Subsections 1 to 5 of Section 1, on
the basis of a certain affine system of coordinates $x_1, \dots, x_n$ in $\mathfrak{A}_n$.
Then the points having coordinates (11) play the following roles:

$A_{n+1}(0, \dots, 0, 1)$ serves as the origin of the affine coordinate
system in $\mathfrak{A}_n$;

$A_1, A_2, \dots, A_n$ are ideal points of the coordinate axes
$x_1, x_2, \dots, x_n$ respectively; their choice determines the direction of
the coordinate axes;

the point $B(1, \dots, 1, 1)$, called the unit point, defines a choice
of basis vectors on each of the axes $x_1, \dots, x_n$ (Fig. 98).

11. We conclude this section with a very popular geometric
model of a real projective space $P_n$. We assume that a Euclidean
metric has been introduced into $\mathfrak{A}_{n+1}$, and together with the bundle
O we consider an n-dimensional sphere S with centre O (for a
definition of an n-dimensional sphere see Subsection 5, Section 6,
Chapter XI). Each straight line of the bundle intersects the sphere
in two diametrically opposite points (for n = 1, see Fig. 99). We
identify such points, that is, each pair of diametrically opposite
points of the sphere S will be regarded as one point of a new set.
By construction, the set thus obtained is in one-to-one
correspondence with the bundle O and, hence, may be taken as a
space $P_n$. In this model, every k-dimensional projective plane is
depicted in the form of a k-dimensional sphere with identified
diametrically opposite points, since each (k + 1)-dimensional plane
of bundle O intersects the sphere S along some sphere of
dimension k (see Fig. 100 where n = 2, k = 1).

Fig. 99    Fig. 100

Note that in the particular case n = 1 (and in this case alone)
it is possible to depict $P_n$ as a one-dimensional sphere, that is, in
the form of a circle without identifying diametrically opposite
points. Namely, on a two-dimensional plane consider a bundle O
(in other words, a pencil of straight lines passing through point O)
and a circle S passing through the same point (Fig. 101). With O
on the circle S we associate the straight line o tangent to the circle
at this point. With any other point $A \in S$ we associate the straight
line OA. We obtain a one-to-one correspondence between the lines
of bundle O and the points of circle S that allows us to regard
the circle as a model of a projective line (in this connection see
Subsection 13, Section 2).

Fig. 101


§ 4. Central projection 

1. Above we considered projective transformations in a given
space $P_n$. This concept can be generalized to the consideration of
a projective mapping of one n-dimensional projective space $P_n$
onto another n-dimensional projective space $P'_n$, on the assumption
that this mapping is given by formulas (1), Section 2, with the
proviso that $(\xi_1, \dots, \xi_{n+1})$ are the coordinates of the inverse image
in the space $P_n$, and $(\xi'_1, \dots, \xi'_{n+1})$ are the coordinates of the
image in the space $P'_n$. In this section we consider an important
special case called central projection.

2. From Subsection 10, Section 2, it follows that in $P_n$ any
straight line and any hyperplane intersect.

It is easy to verify that if the straight line a does not lie entirely
in the hyperplane P, then they have a unique point in common. To
prove this, adjoin the equation of the hyperplane P to the system
of equations of rank n - 1 that define a. If a and P have two
distinct common points, then the combined system of equations has
two independent nontrivial solutions so that its rank is $r \leq n - 1$.
But this is only possible if the equation of the hyperplane P is a
consequence of the equations of the straight line a, that is, when
$a \subset P$.

3. Now suppose in $P_n$ we have chosen two different hyperplanes
P and P' with an arbitrary fixed point O that does not belong to
either of these hyperplanes.

Pass a straight line OM through an arbitrary point M in
hyperplane P and through point O. By Subsection 2, the line OM will
intersect P' in a unique point M', and for every point $M' \in P'$
there will be a unique point $M \in P$ such that M, O and M' are
collinear (Fig. 102).





We call M' the projection of M from the centre O on the
hyperplane P'. We will also write M' = f(M). The one-to-one mapping
M' = f(M) of hyperplane P onto hyperplane P' is called the
central projection of P on P' from the centre O.

4. In place of the hyperplanes P, P' we can consider two planes
$P_k$, $P'_k$ of the same arbitrary dimension k and determine the central
projection of one of them on the other; but a special mutual
arrangement of the planes $P_k$, $P'_k$ and the point O is required.
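In homogeneous coordinates the construction of Subsection 3 reduces to one line of algebra: the points of the line OM have the form $\alpha m + \beta o$, and the condition of lying in P' fixes the ratio $\alpha : \beta$. The sketch below assumes, for illustration only, that the hyperplane P' is given by the coefficient row `target` of its equation; the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def central_projection(o, target, m):
    """Project the point m from the centre o onto the hyperplane whose
    equation is dot(target, xi) = 0. All arguments are homogeneous
    coordinate lists; the centre must not lie in the target hyperplane."""
    po = dot(target, o)
    if po == 0:
        raise ValueError("the centre lies in the target hyperplane")
    pm = dot(target, m)
    # The combination po*m - pm*o lies on the line OM and satisfies the
    # equation of the target hyperplane.
    return [po * mi - pm * oi for mi, oi in zip(m, o)]


# Example in P_2 with affine coordinates x = xi_1/xi_3, y = xi_2/xi_3:
# centre O at the affine origin, P the line x = 1, P' the line y = 1.
O = [0, 0, 1]
P_target = [0, 1, -1]                 # equation xi_2 - xi_3 = 0, i.e. y = 1
M = [1, 2, 1]                         # the point (1, 2) of the line x = 1
print(central_projection(O, P_target, M))          # proportional to (1/2, 1, 1)
print(central_projection(O, P_target, [1, 0, 1]))  # an ideal point of P'
```

The second call shows why ideal points are needed for the mapping to be one-to-one: the line through O and (1, 0) is parallel to y = 1, so the image has last coordinate zero.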



Fig. 103 



For instance, suppose that $P_k$, $P'_k$ and O belong to a
(k + 1)-dimensional projective plane $P_{k+1}$ in the space $P_n$. We will also
assume that the planes $P_k$ and $P'_k$ do not coincide and that the
point O does not belong to them. Then the arguments and
constructions of Subsections 2 and 3 are fully applicable, since $P_k$
and $P'_k$ may be regarded as hyperplanes in a (k + 1)-dimensional
projective space, which is what the plane $P_{k+1}$ (see Fig. 103 in
which k = 1) is.

5. Theorem. The central projection of plane $P_k$ onto plane $P'_k$
($1 \leq k \leq n - 1$) is a projective mapping.

Proof. We know that $P_k$ and $P'_k$ can be regarded as projective
spaces (Section 2, Subsection 14) and that the central projection
M' = f(M) is a one-to-one mapping of $P_k$ onto $P'_k$ (for $k \leq n - 2$
due to the special mutual arrangement of the planes). It is
therefore sufficient for us to determine the type of formulas that specify
this mapping.

In $P_n$ we choose a system of coordinates so that point O has
coordinates $\xi^0_1 = \dots = \xi^0_n = 0$, $\xi^0_{n+1} \neq 0$. By formula (11) of
Section 2, an arbitrary point $M \in P_k$ has coordinates

$$\xi_i = \mu_1 u_i^{(1)} + \dots + \mu_{k+1} u_i^{(k+1)}$$

where $u_i^{(j)}$ are certain numbers. Similarly, the arbitrary point
$M' \in P'_k$ has the coordinates

$$\xi'_i = \mu'_1 v_i^{(1)} + \dots + \mu'_{k+1} v_i^{(k+1)}$$

Here, $\mu_1, \dots, \mu_{k+1}$ may be considered the homogeneous coordinates
of M inside $P_k$, and $\mu'_1, \dots, \mu'_{k+1}$ the homogeneous coordinates
of M' inside $P'_k$. If O, M, M' are collinear, then

$$\alpha\xi_i + \beta\xi'_i = \xi^0_i, \quad i = 1, \dots, n + 1 \qquad (1)$$

Here $\alpha \neq 0$ and $\beta \neq 0$ since none of the points $M(\xi_i)$ and
$M'(\xi'_i)$ coincides with the point $O(\xi^0_i)$. For $i = 1, \dots, n$ we get
$\alpha\xi_i + \beta\xi'_i = 0$, but $\alpha\xi_{n+1} + \beta\xi'_{n+1} \neq 0$. Conversely, if the
equations (1) are ensured for certain numbers $\alpha$, $\beta$ when
$i = 1, \dots, n$, and if $\alpha\xi_{n+1} + \beta\xi'_{n+1} \neq 0$, then the points
$M(\xi_i)$ and $M'(\xi'_i)$ are collinear with point O. Thus, if we want to
find the projection M' of the given point M, we have to solve the
system of equations

$$\lambda\left(\mu'_1 v_i^{(1)} + \dots + \mu'_{k+1} v_i^{(k+1)}\right) = \mu_1 u_i^{(1)} + \dots + \mu_{k+1} u_i^{(k+1)}, \quad i = 1, 2, \dots, n \qquad (2)$$

assuming $\mu_1, \dots, \mu_{k+1}$, $u_i^{(j)}$ and $v_i^{(j)}$ to be given and
$\mu'_1, \dots, \mu'_{k+1}$ to be sought. Then for $\alpha \neq 0$ and $\beta \neq 0$ we can
take any numbers, provided that $\lambda = -\beta/\alpha$. Besides we must have

$$\alpha \sum_{j=1}^{k+1} \mu_j u_{n+1}^{(j)} + \beta \sum_{j=1}^{k+1} \mu'_j v_{n+1}^{(j)} \neq 0 \qquad (3)$$

We know definitely that the point M' exists and is unique.
Therefore system (2) is uniquely solvable and inequality (3) holds.
Besides that, the matrix made up of the columns
$v_i^{(1)}, \dots, v_i^{(k+1)}$, where $i = 1, 2, \dots, n$, has rank k + 1 (see
Subsection 14, Section 2). And so it has a basis minor D of order
k + 1. For the sake of simplicity, let us suppose the minor D is
composed of the first k + 1 rows. Then in (2) we take the first
k + 1 equations and discard the rest. Solving this system by
Cramer's rule, we get

$$\begin{aligned}
\mu'_1 &= \frac{1}{\lambda D}\left(D_{11}\mu_1 + \dots + D_{1,k+1}\mu_{k+1}\right),\\
&\dots\\
\mu'_{k+1} &= \frac{1}{\lambda D}\left(D_{k+1,1}\mu_1 + \dots + D_{k+1,k+1}\mu_{k+1}\right)
\end{aligned} \qquad (4)$$

where $D_{jl}$ is the determinant obtained from D by replacing the jth
column with the column $u_i^{(l)}$ ($j, l = 1, \dots, k + 1$). We see that, to
within the factor $\lambda$, the $\mu'_1, \dots, \mu'_{k+1}$ are expressed by linear and
homogeneous formulas in terms of $\mu_1, \dots, \mu_{k+1}$. The coefficients of
these formulas, that is, $\frac{1}{\lambda D} D_{jl}$, constitute a nonsingular matrix,
for otherwise it would be possible to find numbers
$\mu_1, \dots, \mu_{k+1}$, not all zero, for which $\mu'_1 = \dots = \mu'_{k+1} = 0$. But
this means that a certain point $M \in P_k$ does not have a projection
on $P'_k$, contrary to the hypothesis. We have thus established that
the central projection of $P_k$ onto $P'_k$ from centre O is given by the
homogeneous linear formulas (4) with a nonsingular matrix of
coefficients. The proof of the theorem is complete.

Thus, central projection is a special case of a projective
mapping. Whence, apparently, stems the term projective mapping.

6. The reader is advised, by way of an exercise, to prove that if
a (one-to-one) central projection of the plane $P_k$ onto the plane $P'_k$
from centre O is possible, then $P_k$, $P'_k$ and O lie in a
(k + 1)-dimensional plane of the space $P_n$.

7. The results obtained in Section 2 on projective
transformations automatically carry over to the case of projective mappings,
and so from the theorem proved in Subsection 5 we have the
following proposition.

Given a straight line a in the plane $P_k$ and a straight line
$a' \subset P'_k$, which is the image of a under a projection of $P_k$ onto $P'_k$
from a centre O. Also suppose that U, V, M, N are four points
on a; U', V', M', N' are their projections on a' (Fig. 104). Then

$$(U'V'M'N') = (UVMN)$$

which is to say that the cross ratio is an invariant under central
projections.
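The invariance can be checked numerically on a projective line. The formulas for the cross ratio (Subsection 15, Section 2) are not reproduced in this part of the text, so the sketch below assumes a determinant expression of our own choosing; it is normalized so that it gives -1 for the points A(0, 1), B(1, 1), M(1, 2), N(1, 0), the value stated in Section 5. The matrix `T` is an arbitrary nonsingular example.

```python
from fractions import Fraction


def det2(p, q):
    """2x2 determinant of the homogeneous coordinate pairs p and q."""
    return p[0] * q[1] - p[1] * q[0]


def cross_ratio(u, v, m, n):
    """Cross ratio (UVMN) of four points of a projective line, under the
    assumed convention (UM/MV) : (UN/NV) written with determinants."""
    return Fraction(det2(u, m) * det2(n, v)) / (det2(m, v) * det2(u, n))


def apply(T, p):
    """Projective mapping of the line given by a nonsingular 2x2 matrix T."""
    return [T[0][0] * p[0] + T[0][1] * p[1],
            T[1][0] * p[0] + T[1][1] * p[1]]


U, V, M, N = [0, 1], [1, 1], [1, 3], [2, 1]
T = [[2, 1],
     [1, 3]]                                   # determinant 5, nonsingular

before = cross_ratio(U, V, M, N)
after = cross_ratio(*(apply(T, p) for p in (U, V, M, N)))
print(before, after)
```

Each determinant is multiplied by det T under the mapping, and each point's scaling factor appears once in the numerator and once in the denominator, so the quotient does not change.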

§ 5. Projective equivalence of figures 

1. Two figures in a projective space $P_n$ (that is, two sets
consisting of points, straight lines and k-dimensional planes) are
said to be projectively equivalent if one of them is carried into the
other by means of a projective transformation.

Since all projective transformations form a group, we have the
following propositions.

(1) If a figure $\mathcal{A}$ is projectively equivalent to a figure $\mathcal{A}'$, then
$\mathcal{A}'$ is equivalent to $\mathcal{A}$.

(2) If figure $\mathcal{A}$ is equivalent to $\mathcal{A}'$, and $\mathcal{A}'$ is equivalent to
$\mathcal{A}''$, then $\mathcal{A}$ is equivalent to $\mathcal{A}''$.

(3) Every figure is equivalent to itself.

In projective geometry projectively equivalent figures are not
distinguished, just as, in metric geometry, congruent figures are
indistinguishable.

2. Theorem 1. In n-dimensional projective space, any two planes
of the same dimension are projectively equivalent.

Proof. Given an arbitrary k-dimensional plane $P_k$ specified by
the system of equations

$$\begin{aligned}
a_{11}\xi_1 + \dots + a_{1,n+1}\xi_{n+1} &= 0,\\
\dots\\
a_{r1}\xi_1 + \dots + a_{r,n+1}\xi_{n+1} &= 0
\end{aligned} \qquad (1)$$

the rank of which is equal to the number of equations (r = n - k).
Consider the projective transformation

$$\begin{aligned}
\xi'_1 &= a_{11}\xi_1 + \dots + a_{1,n+1}\xi_{n+1},\\
&\dots\\
\xi'_r &= a_{r1}\xi_1 + \dots + a_{r,n+1}\xi_{n+1},\\
\xi'_{r+1} &= a_{r+1,1}\xi_1 + \dots + a_{r+1,n+1}\xi_{n+1},\\
&\dots\\
\xi'_{n+1} &= a_{n+1,1}\xi_1 + \dots + a_{n+1,n+1}\xi_{n+1}
\end{aligned} \qquad (2)$$

where the coefficients of the expressions $\xi'_{r+1}, \dots, \xi'_{n+1}$ are taken
at pleasure so long as the matrix of the transformation (2) is
nonsingular (such a choice is possible since the system (1) has rank
r = n - k). The plane $P_k$ is carried by transformation (2) into a
completely definite plane $\xi'_1 = 0, \dots, \xi'_r = 0$, which we denote
by $P^0_k$. In the same way it is established that any other
k-dimensional projective plane $P'_k$ is projectively equivalent to the
plane $P^0_k$, whence it follows that $P_k$ and $P'_k$ are equivalent.

3. Let us consider the projective line. On this line all ordered 
triads of distinct points are projectively equivalent by Theorem 1, 
Section 3. Let us now see under what conditions quadruples of 
points are projectively equivalent. We will prove the following 
theorem.

Theorem 2. Ordered quadruples of points U, V, M, N and U', V',
M', N' on a single line are projectively equivalent if and only if
their cross ratios are equal:

$$(UVMN) = (U'V'M'N') \qquad (3)$$

Proof. The necessity of (3) follows from the projective
invariance of the cross ratio. We prove that it is sufficient. Let (3)
hold. By Theorem 1, Section 3, there exists a projective
transformation f such that

$$U' = f(U), \quad V' = f(V), \quad M' = f(M)$$

Put f(N) = N''. Then (U'V'M'N'') = (UVMN) = (U'V'M'N') due
to the projective invariance of the cross ratio and condition (3).
But if distinct points U', V', M' and the cross ratio
(U'V'M'N') = g are given, then N' is defined uniquely since its
homogeneous coordinates are uniquely (up to a factor) expressed by the
formulas of Subsection 15, Section 2, in terms of g and the
coordinates of U', V' and M'. Therefore N'' = N', and the proof is
complete.

4. We say that an ordered pair of points MN divides an ordered
pair of points UV (located on the straight line MN) harmonically
if

$$(UVMN) = -1 \qquad (4)$$

In this case we also say that the quadruple of points U, V, M, N
is a harmonic set and that the point N is the fourth harmonic
point for the (ordered) triad U, V, M.

The harmonicity of the harmonic set of four points is preserved
if we:

(1) interchange the pairs of points M, N and U, V;

(2) interchange the points within any one of these pairs.

These properties follow immediately from formula (4) and the
formulas of Subsection 15, Section 2.

5. Let the affine line be supplemented with an ideal point N.
Consider the segment AB on this line; denote the midpoint by M.

Theorem 3. The two points MN harmonically divide the two
points AB.

In other words, the midpoint of the segment AB is the fourth
harmonic point relative to A, B, N, where N is an ideal point.

Proof. Introduce the affine coordinate x on the straight line so
that x = 0 at point A and x = 1 at point B. Then x = 1/2 at
point M (Fig. 105). Also introduce the homogeneous coordinates
$(\xi, \eta)$, putting $x = \xi/\eta$. We then obtain the following homogeneous
coordinates of the points under consideration: A(0, 1), M(1, 2),
B(1, 1), N(1, 0), whence, via formulas (12)-(14), Section 2, we
get

$$(ABMN) = -1$$
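Theorem 3 can be verified by direct computation, and the fourth harmonic point can also be constructed explicitly. The sketch below again assumes our own determinant form of the cross ratio (the formulas of Section 2 are not reproduced here); `fourth_harmonic` is a hypothetical helper based on the observation that if $M = \alpha U + \beta V$, then $N = \alpha U - \beta V$ completes a harmonic set.

```python
from fractions import Fraction


def det2(p, q):
    return p[0] * q[1] - p[1] * q[0]


def cross_ratio(u, v, m, n):
    """Cross ratio (UVMN) in the assumed determinant convention."""
    return Fraction(det2(u, m) * det2(n, v)) / (det2(m, v) * det2(u, n))


def fourth_harmonic(u, v, m):
    """Point N with (UVMN) = -1. With M = alpha*U + beta*V we have
    alpha ~ det(M, V) and beta ~ det(U, M), and N = alpha*U - beta*V."""
    alpha, beta = det2(m, v), det2(u, m)
    return [alpha * u[0] - beta * v[0], alpha * u[1] - beta * v[1]]


A, B, M = [0, 1], [1, 1], [1, 2]     # x = 0, x = 1 and the midpoint x = 1/2
N = fourth_harmonic(A, B, M)
print(N)                             # second coordinate 0: the ideal point
print(cross_ratio(A, B, M, N))

# Properties (1) and (2) of Subsection 4: swapping the pairs, or the
# points inside one pair, keeps the set harmonic.
print(cross_ratio(M, N, A, B), cross_ratio(B, A, M, N))
```

For any segment the construction returns a point with vanishing second coordinate when M is the midpoint, which is another way of reading Theorem 3.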

6. Now let us consider some figures on a two-dimensional
projective plane.

A three-point is a set of three noncollinear points together with
the three straight lines joining the points in pairs (Fig. 106).

Fig. 105    Fig. 106

Since under a projective transformation a straight line goes into 
a straight line, it follows from Theorem 1 of Section 3 that any two 
three-points on a projective plane are projectively equivalent. 

7. The notion of a three-point affords a good pictorial
illustration of certain geometric properties of a real projective plane. We
give these properties without taking up the proofs.




Fig. 107 



Fig. 108 


One three-point divides the entire real projective plane into four
triangles labelled 1, 2, 3, 4 in Figs. 106 and 107 (in Fig. 107 the
projective plane is depicted as a sphere with identified
diametrically opposite points).

If a polygon is decomposed into triangles on an affine plane, it
is possible to choose the sense of traversal of each of the triangles
so that the traversals of any two adjacent triangles are associated
with opposite-sense motions along their common side (Fig. 108).
This choice of traversal of the triangles may be made in two
different ways, which is in keeping with the two different orientations
of the affine plane.

A projective plane broken up into triangles does not admit such 
a choice of matched traversal of all triangles. For example, if we 
match the traversals of the first and second triangles (Fig. 107), 



and then the second and third triangles, the traversals of the first 
and third triangles will not be matched. 

For this reason, we say that the real projective plane is
nonorientable.

Remark. On an affine plane it is impossible to arrange not only
four but even three triangles with sides adjoined in the fashion
observed in the three-point. However, it is possible in
three-dimensional affine space to construct a model of the mutual
arrangement of any three triangles of a three-point with the aid of the
so-called Möbius strip (a surface pasted together from a twisted
rectangle, as shown in Fig. 109). A fourth triangle of the
three-point can be pasted to the edge of the Möbius strip to obtain a
model of the entire projective plane in the form of a surface if we
allow for deformation of the pasted triangles and self-intersection
of the surface. This self-intersection can be eliminated by moving
into four-dimensional space.


8. A figure in the projective plane composed of four points, of 
which no three are collinear, and six straight lines joining the 
points in pairs is termed a complete quadrangle. 

The indicated points are called vertices and the straight lines
joining them in pairs are said to be the sides of the quadrangle.
Fig. 110 shows a quadrangle ABCD. Sides without a common
vertex are called opposite sides. The quadrangle ABCD has three
pairs of opposite sides: AB and CD, AC and BD, BC and AD. The
points of intersection of opposite sides are called the diagonal
points of the quadrangle. In Fig. 110 the diagonal points are P,
Q, R.

From Theorem 1, Section 3, it follows that any two quadrangles
are projectively equivalent (whereas sets of five points on a
projective plane are, generally, not equivalent).

Observe that all three diagonal points of the quadrangle have 
equal status in the sense that any one of them can be carried into 
another by a projective transformation that carries the quadrangle 
into itself. For instance, if we want to carry point P into Q (see 
Fig. 110), it suffices to take a projective transformation f under 
which 

f(A) = A,  f(B) = C,
f(C) = B,  f(D) = D


Then the line AB goes into AC and CD goes into BD so that the 
points P and Q are interchanged, and the quadrangle ABCD is 
transformed to this same quadrangle. 


9. Given a quadrangle ABCD. Draw through two of its diagonal
points, P and Q, a straight line and denote by E and F the points
of its intersection with the two sides of the quadrangle that pass
through the third diagonal point R (Fig. 111).

Theorem 4. If the foregoing construction has been carried out, 
the following quadruples of points are harmonic sets: 

P, Q, E, F;    A, D, E, R;    B, C, F, R

Proof. We assume the projective plane is obtained by adjoining
ideal points to the affine plane. Perform a projective transformation
that carries A, B, C and D into the vertices A', B', C' and D',
respectively, of a parallelogram. Then P will go to the centre P' of
the parallelogram A'B'C'D' (Fig. 112), Q and R will go to the
ideal points Q' and R' of lines A'B' and A'D', respectively. Line
PQ will go to the straight line parallel to A'B', the points E and
F to the midpoints E' and F' of the opposite sides A'D' and B'C',
and P' will be the midpoint of the line segment E'F'. Therefore,
taking into account Theorem 3 of Subsection 5 and the projective
invariance of a cross ratio, we have

(PQEF) = (P'Q'E'F') = -1,
(ADER) = (A'D'E'R') = -1,
(BCFR) = (B'C'F'R') = -1

which is what we set out to prove.

Fig. 113


10. We now show how it is possible, geometrically, to construct
the fourth harmonic point N for the three given points U, V, M on
a Euclidean plane (U, V, M are three distinct collinear points).
Through U draw a line perpendicular to UV and lay off on it
line segments AU = UD (Fig. 113). Denote by B and C the points
of intersection of DM and AM with the perpendicular to UV that
passes through V. Then lines DC and AB will clearly intersect UV
in one point, which is the desired point (by Theorem 4).
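
The construction of Subsection 10 can be checked by a short computation.
The following Python sketch is an illustration only and is not part of
the original text. It assumes Cartesian coordinates in which UV is the
x-axis, represents points and lines by homogeneous triples, and uses the
hypothetical helper names cross, fourth_harmonic and cross_ratio. The
cross-ratio convention chosen here is one common one; a harmonic set
gives -1 under it.

```python
from fractions import Fraction

def cross(p, q):
    """Cross product of homogeneous triples: the line through two points,
    or the point common to two lines."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def fourth_harmonic(u, v, m, h=1):
    """Abscissa of N for U(u, 0), V(v, 0), M(m, 0) on the x-axis.

    Assumes u, v, m are distinct and M is not the midpoint of UV
    (in that case N is the ideal point and the division below fails)."""
    u, v, m, h = (Fraction(t) for t in (u, v, m, h))
    U, V, M = (u, 0, 1), (v, 0, 1), (m, 0, 1)
    A, D = (u, h, 1), (u, -h, 1)        # AU = UD on the perpendicular at U
    axis = cross(U, V)                  # the line UV
    perp_v = cross(V, (v, 1, 1))        # the perpendicular to UV through V
    B = cross(cross(D, M), perp_v)      # DM meets the perpendicular
    C = cross(cross(A, M), perp_v)      # AM meets the perpendicular
    n1 = cross(cross(D, C), axis)       # DC meets UV
    n2 = cross(cross(A, B), axis)       # AB meets UV
    x1 = Fraction(n1[0]) / Fraction(n1[2])
    x2 = Fraction(n2[0]) / Fraction(n2[2])
    assert x1 == x2, "DC and AB must meet UV in the same point"
    return x1

def cross_ratio(u, v, m, n):
    """(UVMN) for four points of a line given by their abscissas."""
    return Fraction(m - u, m - v) / Fraction(n - u, n - v)

n = fourth_harmonic(0, 4, 1)
print(n, cross_ratio(0, 4, 1, n))
```

The length h of the segments AU = UD is arbitrary; changing it moves
A, B, C, D but leaves N in place, as Theorem 4 requires.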

§ 6. Projective classification of quadric hypersurfaces 

1. Theorem. For two quadric hypersurfaces in a real space $P_n$
to be projectively equivalent, it is necessary and sufficient that the
left members of their equations have the same ranks and equal (in
absolute value) signatures.

Proof. Given the hypersurfaces

$a(\xi, \xi) = 0, \quad b(\xi, \xi) = 0$    (1)

where $a(\xi, \xi)$ and $b(\xi, \xi)$ are quadratic forms in the homogeneous
coordinates $(\xi_1, \ldots, \xi_{n+1}) = \xi$.

The geometric meaning of each of the equations (1) remains
unaltered if both sides of the equation are multiplied by -1. We
can therefore assume that the canonical form of each of the quadratic
forms does not contain more negative terms than positive
terms. Then the signature is nonnegative and the equality of ranks and
signatures of the two quadratic forms is equivalent to the equality
of their ranks and the positive indices. Taking this argument into
account, let us prove first the sufficiency and then the necessity.

(1) Let the quadratic forms $a(\xi, \xi)$ and $b(\xi, \xi)$ have one and
the same rank r and the same positive index k.

We consider the quadratic form

$c(\xi, \xi) = \xi_1^2 + \ldots + \xi_k^2 - \xi_{k+1}^2 - \ldots - \xi_r^2$    (2)

and, together with it, the third hypersurface $c(\xi, \xi) = 0$.

We know that there exists a nonsingular linear transformation
of the variables which carries the form $a(\xi, \xi)$ into a form of type
(2). This means that there is a projective transformation which
carries the hypersurface $a(\xi, \xi) = 0$ into the hypersurface
$c(\xi, \xi) = 0$, that is to say, the indicated hypersurfaces are
projectively equivalent. In the same way, the hypersurface $b(\xi, \xi) = 0$ is
projectively equivalent to the hypersurface $c(\xi, \xi) = 0$. Hence, the
hypersurfaces (1) are projectively equivalent.

(2) Let $a(\xi, \xi) = 0$ and $b(\xi, \xi) = 0$ be projectively equivalent.
This means that there is a linear transformation of the variables $\xi$
into the variables $\eta$ which carries the form $a(\xi, \xi)$ into the
quadratic form $b(\eta, \eta)$. But then the ranks and the positive indices of
the quadratic forms $a(\xi, \xi)$ and $b(\xi, \xi)$ are the same. The proof
is complete.

Remark. In an n-dimensional complex projective space, the
hypersurfaces $a(\xi, \xi) = 0$ and $b(\xi, \xi) = 0$ are projectively equivalent
if and only if the quadratic forms $a(\xi, \xi)$ and $b(\xi, \xi)$ have the same
rank r. The proof is analogous to the preceding.
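
The criterion of the theorem is easy to test on concrete quadratic
forms. The Python sketch below is an illustration under stated
assumptions and is not taken from the original text: a form is given by
its symmetric coefficient matrix with rational entries, the hypothetical
function inertia diagonalizes it by simultaneous row and column
operations (Lagrange's method), and projectively_equivalent_real then
compares ranks and absolute values of signatures.

```python
from fractions import Fraction

def inertia(matrix):
    """Return (positive, negative) term counts of a diagonal form
    congruent to the given symmetric matrix."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    pos = neg = 0
    for k in range(n):
        # look for a nonzero diagonal entry in the trailing block
        p = next((i for i in range(k, n) if a[i][i] != 0), None)
        if p is None:
            pair = next(((i, j) for i in range(k, n)
                         for j in range(i + 1, n) if a[i][j] != 0), None)
            if pair is None:
                break                      # the trailing block is zero
            i, j = pair
            # add row j to row i and column j to column i;
            # the new diagonal entry a[i][i] equals 2 * a[i][j]
            for c in range(n):
                a[i][c] += a[j][c]
            for r in range(n):
                a[r][i] += a[r][j]
            p = i
        # bring the pivot to position k by a simultaneous swap
        a[k], a[p] = a[p], a[k]
        for r in range(n):
            a[r][k], a[r][p] = a[r][p], a[r][k]
        pivot = a[k][k]
        if pivot > 0:
            pos += 1
        else:
            neg += 1
        # clear row k and column k outside the diagonal
        for i in range(k + 1, n):
            f = a[i][k] / pivot
            for c in range(n):
                a[i][c] -= f * a[k][c]
            for r in range(n):
                a[r][i] -= f * a[r][k]
    return pos, neg

def projectively_equivalent_real(a, b):
    """Same rank and same absolute value of the signature."""
    pa, na = inertia(a)
    pb, nb = inertia(b)
    return pa + na == pb + nb and abs(pa - na) == abs(pb - nb)
```

For the complex case of the Remark only the rank pa + na would be
compared.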

2. Definition. A quadric hypersurface $a(\xi, \xi) = 0$ in n-dimensional
projective space is said to be nondegenerate if the quadratic
form $a(\xi, \xi)$ is nonsingular, that is, if its rank $r = n + 1$.

Remark. This definition is in accord with the terminology of the 
preceding chapter. 

3. In a real space $P_n$, every nondegenerate hypersurface is
projectively equivalent to one of the hypersurfaces of the type

$\xi_1^2 + \ldots + \xi_k^2 - \xi_{k+1}^2 - \ldots - \xi_{n+1}^2 = 0$

where $k \geq \frac{n+1}{2}$. Therefore, when n is even, there are
$\frac{n}{2} + 1$ projectively distinct nondegenerate quadric hypersurfaces
in $P_n$; when n is odd, there are $\frac{1}{2}(n + 3)$ such hypersurfaces.

4. In two-dimensional projective space (the projective plane), 
there are (in the real case) two projectively distinct nondegenerate 
quadric hypersurfaces, which, incidentally, it would be more natu¬ 
ral in this case to call curves (which is precisely what is done), 
namely: 

(1) the curve

$\xi_1^2 + \xi_2^2 + \xi_3^2 = 0$

which has no points at all in the real plane and is therefore called
the zero curve;

(2) the curve

$\xi_1^2 + \xi_2^2 - \xi_3^2 = 0$    (3)

which has real points and is called an oval curve.

5. We assume that the projective plane is obtained from the
ordinary plane by adjoining the ideal line $\xi_3 = 0$. The line $\xi_3 = 0$
does not intersect the curve (3) and in equation (3) we can pass to
nonhomogeneous coordinates $x = \xi_1/\xi_3$, $y = \xi_2/\xi_3$ to get the ellipse

$x^2 + y^2 = 1$

6. Now suppose that the straight line $\xi_2 = 0$ is an ideal line.
It intersects the curve (3) in two distinct real points $(\pm\lambda, 0, \lambda)$.
Discarding them, we now pass to the nonhomogeneous coordinates
$x = \xi_3/\xi_2$, $y = \xi_1/\xi_2$ to obtain the hyperbola

$x^2 - y^2 = 1$

7. On the projective plane, make the following transformation of
homogeneous coordinates:

$\eta_1 = \xi_1$,
$\eta_2 = -\xi_2 + \xi_3$,
$\eta_3 = \xi_2 + \xi_3$

Then (3) becomes

$\eta_1^2 - \eta_2\eta_3 = 0$    (3a)

We assume the line $\eta_3 = 0$ to be an ideal line. It intersects the
curve (3a) in the double point $(0, \lambda, 0)$. Discarding it, we put
$x = \eta_1/\eta_3$, $y = \eta_2/\eta_3$ to get the parabola $y = x^2$.

8. Thus, the affinely distinct ellipse, hyperbola and parabola are
obtained from one and the same oval curve, depending on how it
is located relative to the straight line which is (or is assumed to
be) at infinity.

There is no such distinction in those models of the projective
plane where the line at infinity is not indicated. For instance, if
for $P_2$ we take the bundle O in $\mathfrak{A}_3$, then the oval curve is an
ordinary cone. When passing from the bundle to the plane in
accordance with Subsection 2, Section 3, we get an ellipse, a hyperbola
or a parabola, depending on the position of the plane $\mathfrak{A}_2$
relative to the cone at hand (see Figs. 114, 115, 116).
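
The three passages to nonhomogeneous coordinates in Subsections 5 to 7
can be verified on sample points of the oval curve (3). The Python
sketch below is illustrative only and is not part of the original text.
The rational parametrization in oval_point and the chart function names
are assumptions introduced here, and the assignment of x and y in the
second chart is one consistent choice.

```python
from fractions import Fraction

def oval_point(s, t):
    """A point (xi1, xi2, xi3) with xi1^2 + xi2^2 - xi3^2 = 0."""
    s, t = Fraction(s), Fraction(t)
    return (2 * s * t, s * s - t * t, s * s + t * t)

def ellipse_chart(xi):
    """Ideal line xi3 = 0: x = xi1/xi3, y = xi2/xi3."""
    return xi[0] / xi[2], xi[1] / xi[2]

def hyperbola_chart(xi):
    """Ideal line xi2 = 0: x = xi3/xi2, y = xi1/xi2."""
    return xi[2] / xi[1], xi[0] / xi[1]

def parabola_chart(xi):
    """Ideal line eta3 = xi2 + xi3 = 0 after the change of Subsection 7."""
    eta1, eta2, eta3 = xi[0], -xi[1] + xi[2], xi[1] + xi[2]
    return eta1 / eta3, eta2 / eta3

for s, t in [(1, 2), (3, 1), (2, 5)]:
    xi = oval_point(s, t)
    x, y = ellipse_chart(xi)
    assert x * x + y * y == 1
    x, y = hyperbola_chart(xi)
    assert x * x - y * y == 1
    x, y = parabola_chart(xi)
    assert y == x * x
```

The sample parameters avoid the points that lie on the respective ideal
line, where the chart is undefined.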

9. By Subsection 3, there are three distinct nondegenerate quadric
surfaces in three-dimensional real projective space. They are:

(1) $\xi_1^2 + \xi_2^2 + \xi_3^2 + \xi_4^2 = 0$, the zero surface (imaginary
ellipsoid) devoid of real points.

(2) $\xi_1^2 + \xi_2^2 + \xi_3^2 - \xi_4^2 = 0$, an oval surface. This type includes
the ellipsoid, elliptic paraboloid and the hyperboloid of two sheets. The
analogy is complete with the oval curve considered above.

(3) $\xi_1^2 + \xi_2^2 - \xi_3^2 - \xi_4^2 = 0$, a toroidal surface. It is easy to
verify by an appropriate calculation that when passing to affine space a
toroidal surface turns into a hyperboloid of one sheet or a hyperbolic
paraboloid. The difference is that a hyperboloid of one sheet
intersects the ideal plane along an oval curve, while the hyperbolic
paraboloid intersects the ideal plane along two rectilinear
generators. The torus is a good pictorial model of a toroidal surface.






Fig. 116 
Without dwelling on the proof, we confine ourselves to Fig. 117,
in which to the parallels $\alpha$, $\beta$, $\gamma$, $\delta$ of the torus there correspond
rectilinear generators of the toroidal surface (those are labelled
with the same letters); to the meridians I, II, III, IV, V of the
torus correspond oval curves on the toroidal surface that are
labelled with the same numbers. The infinitely distant oval curve of
the toroidal surface corresponds to a single meridian ABCD of the
torus. It is depicted as two copies, although one should imagine
points having the same labels as being identical.



Fig. 117 


In Subsection 9, Section 6, Chapter XI, we noted that there are
two distinct types of real quadric cones in four-dimensional affine
space. One of them represents an oval surface, the other a toroidal
surface if for the model of $P_3$ we consider a bundle in $\mathfrak{A}_4$.

10. Also note that when considering degenerate surfaces it is
necessary to bear in mind that in projective space there is no
longer any difference between cylinders and cones. The rectilinear
generators of a cylinder that are parallel from the affine standpoint
intersect in a single ideal point. For example, in three-dimensional
real projective space the equation (3) specifies a real cone. When
passing to affine space, this cone, depending on the position of the
ideal plane, turns into one of the following four surfaces that are
distinct in the affine classification: the cone $x^2 + y^2 - z^2 = 0$, an
elliptic cylinder, a parabolic cylinder, or a hyperbolic cylinder.
11. We now return to the two-dimensional case. It is readily
seen that when passing from the affine plane to the projective
plane, a quadric curve is supplemented with ideal points of those
straight lines (and only such lines) that have asymptotic directions
relative to the curve under consideration (Figs. 118, 119). In
Fig. 119 (as in Fig. 116) we have a parabola and a model of an
oval curve in the form of a cone $\alpha$ in the bundle O. To the
parabola corresponds the cone $\alpha$ with the exception of the single
rectilinear generator a. The axis of the parabola and all straight lines
of the plane $\mathfrak{A}_2$ parallel to it have a common ideal point A, to which
corresponds, in the bundle O, the straight line a. A is the ideal point of
the parabola under consideration.

12. The assertion stated in Subsection 11 will be proved at once
for the n-dimensional case.

If we assume the points $\xi_{n+1} = 0$ to be ideal points, then to find
all the ideal points lying on a quadric hypersurface it will suffice
to put $\xi_{n+1} = 0$ in the equation of the hypersurface. Returning to
equation (6) of Section 1, we obtain, for $\xi_{n+1} = 0$,

$\sum_{i,j=1}^{n} a_{ij}\xi_i\xi_j = 0$    (4)

Except for notation, equation (4) is the same as equation (8),
Section 9, Chapter XI, which defines the coordinates of vectors
having asymptotic directions relative to the hypersurface at hand.
From this it follows that the straight lines with asymptotic directions
are precisely those straight lines on which the ideal points of the
given hypersurface lie.


§ 7. The intersection of a quadric hypersurface 
and a straight line. Polars 


1. Given in an n-dimensional projective space $P_n$ a quadric
hypersurface

$\sum_{i,j=1}^{n+1} a_{ij}\xi_i\xi_j = 0$    (a)

and a straight line

$\xi_i = \mu u_i + \nu v_i \quad (i = 1, \ldots, n+1)$    (1)

passing through any two points $U(u_1, \ldots, u_{n+1})$, $V(v_1, \ldots, v_{n+1})$
of $P_n$. To find the points of intersection of line (1) with
hypersurface (a), put expressions (1) into (a). We get an equation of the
form

$A\mu^2 + 2B\mu\nu + C\nu^2 = 0$    (2)

where

$A = \sum_{i,j=1}^{n+1} a_{ij}u_iu_j, \quad B = \sum_{i,j=1}^{n+1} a_{ij}u_iv_j, \quad C = \sum_{i,j=1}^{n+1} a_{ij}v_iv_j$    (3)

Equation (2) determines the desired points of intersection. 

Let us investigate (2). The solution $\mu = \nu = 0$ yields
$\xi_1 = \ldots = \xi_{n+1} = 0$ so that no point in $P_n$ corresponds to it. Only
those solutions need be sought for which $\mu$ and $\nu$ do not vanish at
the same time. The following three cases are possible, depending
on the coefficients A, B, C.

(1) $AC - B^2 \neq 0$. We will show that in this case there are two
distinct points of intersection.

First suppose that $A \neq 0$. Then if $\nu = 0$, it follows that $\mu = 0$.
So we assume that $\nu \neq 0$. Dividing (2) by $\nu^2$ we get for the ratio
$\mu/\nu$ a quadratic equation, which has two distinct roots. Denote them
by $\lambda_1$ and $\lambda_2$. We then get two sets of solutions of (2):

$\mu = \lambda_1\nu, \quad \mu = \lambda_2\nu$    (4)

where $\nu$ is a free unknown. For $\nu \neq 0$, we get from (1) and (4)
the homogeneous coordinates of two distinct points of intersection
of the hypersurface (a) and the line (1).

If under these circumstances $a_{ij}$, $u_i$, $v_i$ are real, but
$AC - B^2 > 0$, then for $\lambda_{1,2}$ we have complex conjugate values. In that
case, even if we are considering a real space $P_n$, we say that the straight
line (1) intersects the hypersurface (a) in two complex conjugate
points.





Now suppose $A = 0$. Geometrically, this means that the point U
lies on the hypersurface (a) (see the first of the equations (3)). To
the point U corresponds the solution set $(\mu, 0)$ of (2). From the
condition $AC - B^2 \neq 0$ it follows that $B \neq 0$ and so
besides U there is another point of intersection whose coordinates
are determined from (1) provided that $2B\mu + C\nu = 0$.

(2) $AC - B^2 = 0$ but at least one of the coefficients A, B, C
is nonzero. Arguing as before, we can easily verify that (2) has
two solution sets that have merged into a single solution set either
of the form $\nu = \lambda\mu$ or of the form $\mu = \lambda\nu$. It yields
a unique point belonging to the line (1) and to the hypersurface (a).

In this case however we say that there is a double point of
intersection. If such a point is not a singular point of the hypersurface
(as, for instance, the vertex of a cone), then line (1) is
tangent to the hypersurface at this point.

(3) A = B = C = 0. Equation (2) becomes an identity. This 
means that line (1) lies entirely in the hypersurface (a), that is 
to say, it is its rectilinear generator. 

Remark. In contrast to affine space, in projective space any
quadric hypersurface intersects with any straight line and the
concept of asymptotic direction is meaningless.
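
The three cases of Subsection 1 translate directly into a small
computation on the coefficients A, B, C of (3). The Python sketch below
is an illustration and not part of the original text. It assumes real
(here rational) coefficients, and the names bilinear and
classify_intersection are hypothetical.

```python
from fractions import Fraction

def bilinear(a, p, q):
    """Value of the polar bilinear form: sum of a[i][j] * p[i] * q[j]."""
    n = len(a)
    return sum(Fraction(a[i][j]) * p[i] * q[j]
               for i in range(n) for j in range(n))

def classify_intersection(a, u, v):
    """Classify how the line through U and V meets the quadric with
    symmetric coefficient matrix a (real data assumed)."""
    A, B, C = bilinear(a, u, u), bilinear(a, u, v), bilinear(a, v, v)
    if A == B == C == 0:
        return "line lies in the hypersurface"
    d = A * C - B * B
    if d == 0:
        return "double point"
    # for real data the roots of (2) are real exactly when B^2 - AC > 0
    return "two real points" if d < 0 else "two complex conjugate points"

oval = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]     # xi1^2 + xi2^2 - xi3^2 = 0
print(classify_intersection(oval, (0, 0, 1), (1, 0, 0)))
print(classify_intersection(oval, (1, 0, 1), (0, 1, 0)))
print(classify_intersection(oval, (2, 0, 1), (0, 1, 0)))
```

With the ideal line taken as xi3 = 0, the three sample lines are the
x-axis, the tangent x = 1 and the line x = 2 relative to the unit circle.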

2. Let us replace the quadratic form in the left member of the
equation (a) by the polar bilinear form

$\sum_{i,j=1}^{n+1} a_{ij}u_i\xi_j = 0$    (5)

regarding $u_1, \ldots, u_{n+1}$ as the coordinates of an arbitrarily chosen
fixed point $U \in P_n$, $\xi_1, \ldots, \xi_{n+1}$ being the running coordinates.

Equation (5) determines a hyperplane, with the exception of the
special case where the coefficients of all the $\xi_j$ vanish, that is,
when

$a_{11}u_1 + \ldots + a_{n+1,1}u_{n+1} = 0$,
. . . . . . . . . . . . . . . . . . . .
$a_{1,n+1}u_1 + \ldots + a_{n+1,n+1}u_{n+1} = 0$    (6)

But the homogeneous coordinates $u_1, \ldots, u_{n+1}$ do not vanish
simultaneously and so from (6) we get

$\det \|a_{ij}\| = 0$    (7)

Then, multiplying (6) by $u_1, \ldots, u_{n+1}$ respectively and adding, we
obtain

$\sum_{i,j=1}^{n+1} a_{ij}u_iu_j = 0$    (8)
The relations (7) and (8) show that (5) may become an identity
only when the hypersurface (a) is degenerate and the point U
belongs to (a).

What is more, not only does equation (8) hold true with respect
to the point U, but every one of equations (6) holds as well. Such
a point U is called a singular point of the hypersurface (a). Only
degenerate hypersurfaces have singular points. A typical instance
of a singular point is the vertex of a cone. To summarize, then, (5)
becomes an identity when the hypersurface (a) is degenerate, and
point U belongs to it and is its singular point. In all other cases
(5) determines some hyperplane.

Definition. The hyperplane (5) is called the polar of point U with 
respect to the hypersurface (a). 

From the definition it follows directly that if the point U is
located on the hypersurface (a) and has a polar, then this polar
passes through U; but if U does not belong to (a), then the polar
of the point U does not pass through this point.

If the hyperplane P is the polar of point U, then U is called 
the pole of hyperplane P (with respect to the hypersurface (a) 
under consideration). It is easy to demonstrate that an arbitrary 
hyperplane P has a unique pole with respect to any nondegenerate 
quadric hypersurface. 

Indeed, suppose we have the hyperplane

$A_1\xi_1 + \ldots + A_{n+1}\xi_{n+1} = 0$

To find the pole of this hyperplane we obtain from (5) the system

$a_{11}u_1 + \ldots + a_{n+1,1}u_{n+1} = A_1$,
. . . . . . . . . . . . . . . . . . . .
$a_{1,n+1}u_1 + \ldots + a_{n+1,n+1}u_{n+1} = A_{n+1}$    (9)

Since the given quadric hypersurface is nondegenerate, we have
$\det \|a_{ij}\| \neq 0$. Under this condition, the system of equations (9)
has a unique solution $(u_1, \ldots, u_{n+1})$, which is what we sought.
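
Formula (5) and system (9) can be tried out on a concrete conic. The
Python sketch below is an illustration only and is not from the original
text. It assumes a symmetric coefficient matrix with rational entries,
uses the hypothetical names polar and pole, and solves (9) by
Gauss-Jordan elimination.

```python
from fractions import Fraction

def polar(a, u):
    """Coefficients A_j = sum over i of a[i][j] * u[i] of the polar of U."""
    n = len(a)
    return [sum(Fraction(a[i][j]) * u[i] for i in range(n)) for j in range(n)]

def pole(a, plane):
    """Solve system (9) for u. The matrix a is assumed nonsingular;
    otherwise the pivot search raises StopIteration."""
    n = len(a)
    # row j of the augmented matrix encodes: sum_i a[i][j] * u[i] = plane[j]
    m = [[Fraction(a[i][j]) for i in range(n)] + [Fraction(plane[j])]
         for j in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]

oval = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # xi1^2 + xi2^2 - xi3^2 = 0
U = (2, 0, 1)
plane = polar(oval, U)                      # 2*xi1 - xi3 = 0
print(plane, pole(oval, plane))
```

In affine terms (ideal line xi3 = 0) the point U is (2, 0) and its polar
with respect to the unit circle is the line x = 1/2. Since homogeneous
coordinates are defined up to a common factor, the pole is recovered up
to proportionality.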

3. The foregoing definition of a polar as a hyperplane which is
given for the point $U(u_1, \ldots, u_{n+1})$ by equation (5) is connected
with a certain system of projective coordinates. We have to
demonstrate that this definition has an invariant (geometric) meaning,
that is, that the polar of the given point U with respect to the
given quadric hypersurface does not depend on the choice of the
system of projective coordinates.

Suppose we are passing from the old coordinates $\xi_i$ to the new
coordinates $\xi'_i$ via formulas of type (1), Subsection 4, Section 2.
Without loss of generality, we can assume in these formulas $\lambda = 1$.
In the terminology of tensor algebra, formulas of this type define
a contravariant transformation law. In particular, the old
coordinates $u_i$ of point U transform by these formulas to the new
coordinates $u'_i$ of the same point. Transforming the left member of the
equation of the given quadric hypersurface, we get the identity

$\sum_{i,j=1}^{n+1} a_{ij}\xi_i\xi_j = \sum_{i,j=1}^{n+1} a'_{ij}\xi'_i\xi'_j$

as a consequence of which the new coefficients $a'_{ij}$ are expressed
in terms of the old coefficients $a_{ij}$ by the covariant law. But then

$\sum_{i,j=1}^{n+1} a_{ij}u_i\xi_j = \sum_{i,j=1}^{n+1} a'_{ij}u'_i\xi'_j$    (10)

since the complete contraction of a second-order covariant tensor
with two contravariant tensors is an invariant. From (10) it is
evident that the equations $\sum_{i,j=1}^{n+1} a_{ij}u_i\xi_j = 0$ and
$\sum_{i,j=1}^{n+1} a'_{ij}u'_i\xi'_j = 0$
hold simultaneously and, consequently, define one and the same
hyperplane. This proves the invariance of the definition of the polar,
that is, the independence of the polar of any choice of projective
coordinates. Now note that formulas of type (1), Subsection 4 of
Section 2, may be regarded from another viewpoint, namely as
formulas of a projective transformation. Therefore we have also
proved the following theorem.

Theorem 1 (projective invariance of a polar). If under a
projective transformation the hypersurface (a) goes into a hypersurface
(a'), and point U goes into point U', then the polar of U with
respect to (a) goes into the polar of U' with respect to (a').

4. In Section 5 we defined harmonic sets of four points. For what
follows we will have to extend this notion to the case where the
points of one of the two harmonically divided pairs are coincident.

For the time being we assume that a projective line with a
quadruple of points M, N, U, V is obtained by supplementing an affine
line with the ideal point U. Then (MNUV) = -1 if V is the
midpoint of the line segment MN (see Section 5, Subsection 5). Now
let point N tend to M and point V remain the fourth harmonic
relative to the ordered triad M, N, U. Then V tends to M.

On this basis we will generally assume that if M = N, then the
fourth harmonic point V relative to the ordered triad M, N, U
coincides with M and N, and we will then write

(MNUV) = (UVMN) = -1

5. Assume U does not belong to the hypersurface (a) and also
that the straight line a passes through U.
By Subsection 1, line a intersects the hypersurface in two points
M, N (which are distinct, coincident or complex conjugate).

Definition. Points U and V on the straight line a are situated
harmonically relative to the hypersurface (a) if (UVMN) = -1.

Remark. This definition can also be used when the space is real
and the points M, N are complex conjugate points (here we use
the terminology of Subsection 1, Section 7). Using the formulas of
Subsection 15, Section 2, we can prove that if U is real in this
case, then V is also real.

6. Theorem 2. If point U does not belong to the hypersurface
(a), then the polar of U is a locus of all points V such that the
pairs UV are situated harmonically relative to (a).

Proof. Let V be the fourth harmonic point for the points M, N,
U. The coordinates of an arbitrary point on line a (this point is
distinct from U) can be represented as

$\xi_i = \lambda u_i + v_i$    (11)

(see (1) for $\nu \neq 0$, $\lambda = \mu/\nu$). The points M and N are determined
from (11) for $\lambda = \lambda_1$, $\lambda = \lambda_2$, where $\lambda_1$, $\lambda_2$ are the roots of the
quadratic equation

$\sum_{i,j=1}^{n+1} a_{ij}v_iv_j + 2\lambda\sum_{i,j=1}^{n+1} a_{ij}u_iv_j + \lambda^2\sum_{i,j=1}^{n+1} a_{ij}u_iu_j = 0$    (12)

which is obtained by substituting (11) into (a).

Suppose that $\lambda_1 \neq \lambda_2$. Then points M and N are distinct and by
virtue of the choice of point V we have

$(MNUV) = (UVMN) = \frac{\lambda_1}{\lambda_2} = -1$    (13)

Hence $\lambda_1 + \lambda_2 = 0$ and by Vieta's theorem (sum and product
formulas of the roots of an equation)

$\sum_{i,j=1}^{n+1} a_{ij}u_iv_j = 0$    (14)

This equation shows that V belongs to the polar of the point U.

If $\lambda_1 = \lambda_2$, then M = N and, by Subsection 4, V = M = N;
from (11) we find that $\lambda_1 = \lambda_2 = 0$ (these equations may also be
obtained from (12) if we take into account that V lies on the
hypersurface (a)). Thus we again have

$\lambda_1 + \lambda_2 = 0$    (15)

whence, as above, we get (14).

Now let V belong to the polar, that is, (14) holds true. From
(12), (14) and Vieta's theorem follows (15). Now there are two
possibilities:

either $\lambda_1 = \lambda_2 = 0$ and then V = M = N due to (11), and
(UVMN) = -1 by Subsection 4;

or $\lambda_1 \neq \lambda_2$ and (13) is applicable.

The proof of Theorem 2 is complete.
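
The role of Vieta's theorem in this proof can be seen on a numerical
example. The Python sketch below is illustrative and not part of the
original text. The names bilinear, vieta and point_on_line are
hypothetical, and the conic and the points U, V are sample data chosen so
that the roots of (12) are rational.

```python
from fractions import Fraction

def bilinear(a, p, q):
    """Value of the polar bilinear form: sum of a[i][j] * p[i] * q[j]."""
    n = len(a)
    return sum(Fraction(a[i][j]) * p[i] * q[j]
               for i in range(n) for j in range(n))

def vieta(a, u, v):
    """Sum and product of the roots of (12) for xi = lam * U + V.
    Requires that U is not on the hypersurface, i.e. a(U, U) != 0."""
    auu, auv, avv = bilinear(a, u, u), bilinear(a, u, v), bilinear(a, v, v)
    return -2 * auv / auu, avv / auu

def point_on_line(u, v, lam):
    """The point lam * U + V of formula (11)."""
    return tuple(lam * ui + vi for ui, vi in zip(u, v))

oval = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # xi1^2 + xi2^2 - xi3^2 = 0
U, V = (2, 0, 1), (1, 0, 2)                 # V lies on the polar of U
root_sum, root_product = vieta(oval, U, V)  # sum 0, so the roots are +1, -1
M, N = point_on_line(U, V, 1), point_on_line(U, V, -1)
print(root_sum, root_product, M, N)
```

Here the sum of the roots vanishes exactly because V satisfies (14), and
the two roots then give the points M and N of the conic on the line UV.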

7. Theorem 1 is a geometrically obvious consequence of
Theorem 2. To illustrate this fact, consider the case where point U does
not belong to hypersurface (a). Then in order to construct the
polar of U it suffices to find n points $V_1, \ldots, V_n$ in the general
position and such that all pairs $UV_i$ are located harmonically
relative to (a). The hyperplane passing through the points $V_1, V_2, \ldots, V_n$
will be the polar of the point U. Under a projective
transformation it will go into the polar of the image of point U with
respect to the image of the hypersurface (a) because of the
projective invariance of the cross ratio.

Fig. 120

8. Suppose point U does not belong to the hypersurface (a) and 
lies on a certain hyperplane that is assumed to be at infinity. Then 
from Theorem 2 and Subsection 5, Section 5 (with account taken 
of Section 10 of Chapter XI), it follows that the polar P of point U 
is a diametral hyperplane conjugate to the direction of the parallel 
straight lines that intersect at U (Fig. 120). 

9. Let us consider the special case where the hypersurface has
an equation like

$c_1\xi_1^2 + \ldots + c_{n+1}\xi_{n+1}^2 = 0$    (16)

where $c_i \neq 0$ for all $i = 1, \ldots, n + 1$ and for the point U we take
a point $A_j$ with coordinates

$\xi_j = 1, \quad \xi_i = 0$ for $i \neq j$    (17)

where j is a fixed number, $1 \leq j \leq n + 1$.

By formula (5) the hyperplane $\xi_j = 0$ is the polar of point $A_j$.

Recall that the choice of points with coordinates like (17) and
of the unit point E(1, ..., 1) uniquely determines a system of
coordinates in $P_n$ (Section 3, Subsection 9). In the given case, the
set of points $A_1, \ldots, A_{n+1}$ has the following property that describes
the specific position of these points relative to the hypersurface
(16).

Each one of the points $A_j$ is a pole of the hyperplane passing
through the remaining points $A_1, \ldots, A_{j-1}, A_{j+1}, \ldots, A_{n+1}$.

Such a set of points is said to be self-polar.

It can be demonstrated that this property is not only necessary
but also sufficient for the equation of a nondegenerate quadric
hypersurface to assume the form (16), and by an apt choice of the
unit point we can have $|c_i| = 1$ for $i = 1, \ldots, n + 1$.

Fig. 121    Fig. 122

10. We now consider some properties of polars. 

Theorem 3 (equality principle in the theory of polars). If a point
V is located on the polar of a point U, then the polar of V passes
through U.

Proof. Theorem 3 follows from the definition of a polar and the
symmetric nature of the matrix $\|a_{ij}\|$ of the coefficients of
equation (a).

Theorem 4. If point U lies on the hypersurface (a) and has
polar P, then every straight line in P that passes through U is
tangent to this hypersurface at U and is possibly its rectilinear
generator (Fig. 121).

Proof. If a straight line of type (1) does not have any common
points with the hypersurface (a) other than U, then it is a tangent
by Subsections 1 and 2. It therefore suffices to prove that if on
line (1) there is, other than U, a point V belonging both to the
hypersurface (a) and to the polar (5) of U, then line (1) lies entirely in
the hypersurface. Let the points U and V belong to the
hypersurface (a):

$\sum_{i,j=1}^{n+1} a_{ij}u_iu_j = 0, \quad \sum_{i,j=1}^{n+1} a_{ij}v_iv_j = 0$    (18)

and besides let V lie on the polar of U:

$\sum_{i,j=1}^{n+1} a_{ij}u_iv_j = 0$    (19)

Using (1), (18), (19) and (a), we find that

$\sum_{i,j=1}^{n+1} a_{ij}\xi_i\xi_j = \sum_{i,j=1}^{n+1} a_{ij}(\mu u_i + \nu v_i)(\mu u_j + \nu v_j)$
$= \mu^2\sum_{i,j=1}^{n+1} a_{ij}u_iu_j + 2\mu\nu\sum_{i,j=1}^{n+1} a_{ij}u_iv_j + \nu^2\sum_{i,j=1}^{n+1} a_{ij}v_iv_j = 0$

for arbitrary $\mu$, $\nu$, that is, the straight line UV is a rectilinear
generator of the hypersurface (a). The proof of Theorem 4 is
complete.

Corollary. If a point on a projective plane belongs to an oval 
curve, then the polar of this point is tangent to the oval curve. 

11. Let us consider the following problem as an application of the
results just obtained.

Given on a two-dimensional Euclidean plane an ellipse and a
point U exterior to the ellipse. It is required to construct tangents
to the ellipse that will pass through U.

Construction. Through U draw any two straight lines, each of
which intersects the ellipse in two distinct (real) points A, B and
C, D, respectively (Fig. 122). Let Q be the point of intersection of
the lines AD and BC, and let P be the point of intersection of AC
and BD; K and L are points of intersection of the ellipse and the
straight line PQ. Then UK and UL are the desired tangents.

The proof is readily carried out with the aid of Theorem 4 of
Subsection 9, Section 5, Theorems 2 and 3 of this section, and the
corollary stated in Subsection 10.
PROOF OF THE THEOREM
ON THE CLASSIFICATION
OF LINEAR QUANTITIES


1. The assertion expressed in Chapter VI at the end of Section 2 and in
Subsection 8 of Section 3 and left without proof can be stated in the form of
a theorem which follows.

THEOREM. Let $G_n$ be the group of all real nonsingular n × n matrices and
f(P) a real numerical function specified on $G_n$. Let u = f(P) be a
homomorphism of the group $G_n$ into the group (under multiplication) of all
real numbers without zero, that is

$f(PP') = f(P)f(P')$    (1)

for arbitrary $P, P' \in G_n$. Then

either $f(P) = |\det P|^{\sigma}$, $\sigma$ a constant,    (2)

or $f(P) = \pm|\det P|^{\sigma}$    (3)

where the plus sign corresponds to the case det P > 0 and the minus sign to the
case det P < 0.

REMARK 1. Both functions (2) and (3) satisfy the condition (1) due to
the familiar property of the determinant of a product of matrices. And so the
essence of the theorem lies in the guarantee that there are no functions
satisfying condition (1) except (2) and (3).

REMARK 2. In the statement of the theorem given above we have dropped
the function-theoretic conditions. We will prove the theorem assuming that
the function f(P) is continuous on $G_n$.

REMARK 3. From now on we can forget that u = f(P) is a homomorphism
of $G_n$ into the group (under multiplication) of the real numbers without zero
and only require the observance of (1). The point is that if we exclude the
uninteresting case of the identity $f(P) \equiv 0$, then it follows of itself from (1)
that $f(P) \neq 0$ for all $P \in G_n$.

Indeed, let us assume that there is at least one matrix $P_0 \in G_n$ for which
$f(P_0) = 0$. Then for any $P \in G_n$ we have $f(P) = f(PP_0^{-1})f(P_0) = 0$.

From this we obtain an important corollary to condition (1):

$f(E) = 1$    (4)

where E is the unit matrix. Equation (4) follows from the relation
$f(P) = f(PE) = f(P)f(E)$ since $f(P) \neq 0$.

From (4) we get

$f(P^{-1}) = \{f(P)\}^{-1}$    (5)

since $f(P^{-1})f(P) = f(E) = 1$.
2. PROOF. First for n = 1, for which we have

$f(xy) = f(x)f(y)$    (6)

where $x, y \in R$. R stands for the real line with zero deleted.

Note several simple corollaries of (6).

(1) If x > 0, then f(x) > 0. True enough, for

$f(x) = f(\sqrt{x})f(\sqrt{x}) = \{f(\sqrt{x})\}^2 > 0$

(2) If there is a number $x_0 < 0$ for which $f(x_0) < 0$, then f(x) < 0 for any
x < 0 ($x \in R$). Indeed, if x < 0, then $f(x)f(x_0) = f(xx_0) > 0$.

Under this same assumption, we have f(x) = -f(|x|) for x < 0. Indeed,
due to (5) we find $\{f(|x|)\}^{-1} = f(|x|^{-1})$, whence $f(x)/f(|x|) = f(-1) = -1$,
since $f(-1)f(-1) = f(1) = 1$ and f(-1) < 0.

(3) If there is a number $x_0 < 0$ for which $f(x_0) > 0$, then f(x) > 0 for all
$x \in R$, and always f(x) = f(|x|).

Because of properties (1), (2), (3) the matter reduces to considering the
semiaxis x > 0.

(4) For every rational number r > 0 and for arbitrary x > 0,

$f(x^r) = \{f(x)\}^r$    (7)

True because if r = n (n natural), then by (6)

$f(x^n) = f(xx \ldots x) = f(x)f(x) \ldots f(x) = \{f(x)\}^n$    (8)

If r = 1/m (m natural), then due to (8) $\{f(x^{1/m})\}^m = f(x)$, whence

$f(x^{1/m}) = \{f(x)\}^{1/m}$    (9)

From (8) and (9) we get

$f(x^{n/m}) = \{f(x)\}^{n/m}$

which proves (7).

Now let us take a fixed number a, a > 0, $a \neq 1$. Put b = f(a). Under our
hypothesis b > 0. We can write $b = a^{\sigma}$, $\sigma$ constant. Let x be any positive
number. We can also write $x = a^k$ and assume that k is the limit of a
sequence of rational numbers $r_n$:

$k = \lim_{n \to +\infty} r_n$

By (7)

$f(a^{r_n}) = \{f(a)\}^{r_n}$

whence and due to the continuity of f(x) we have

$f(x) = f(a^k) = f(\lim a^{r_n}) = \lim f(a^{r_n}) = \lim \{f(a)\}^{r_n}$
$= \{f(a)\}^k = b^k = (a^{\sigma})^k = (a^k)^{\sigma} = x^{\sigma}$.

Thus the theorem is proved for n = 1. Namely, either $f(x) = |x|^{\sigma}$ for every
$x \in R$ or $f(x) = \pm|x|^{\sigma}$, where the plus and minus correspond to the cases
x > 0 and x < 0 ($x \in R$).

3. We now take up the group $G_n$ for any n. Assume that there is a
numerical function f(P), $P \in G_n$, that satisfies the condition (1).

First of all note that this function assumes the same value on equivalent
matrices. Thus, if

$P_2 = AP_1A^{-1}$

then by (1) and (5)

$f(P_2) = f(A)f(P_1)f(A^{-1}) = f(P_1)$
We can therefore assume that the function $f(P)$ at hand is defined on the set
of all nonsingular linear transformations of an n-dimensional real linear space
$L_n$. Here the symbol P is to be understood either as the designation of a
linear transformation or as that of its matrix (relative to any basis).

We introduce a Euclidean metric into the space $L_n$. Then for any P we
have

$$P = JB \tag{10}$$

where J is an isometric linear transformation and B is a self-adjoint
transformation (see Chapter IX). We can say that $\det J > 0$ (and hence,
$\det J = +1$).

Consider the subgroup j(t) of the group $G_n$ made up of matrices of type

$$
j(t) = \begin{pmatrix}
1 &        &   &        &         &   &        &   \\
  & \ddots &   &        &         &   &   0    &   \\
  &        & 1 &        &         &   &        &   \\
  &        &   & \cos t & -\sin t &   &        &   \\
  &        &   & \sin t & \cos t  &   &        &   \\
  &        &   &        &         & 1 &        &   \\
  &   0    &   &        &         &   & \ddots &   \\
  &        &   &        &         &   &        & 1
\end{pmatrix} \tag{11}
$$

assuming that the 2 × 2 submatrix (*) with the entries cos t, −sin t, sin t, cos t
occupies a fixed place on the diagonal. Let us verify that $f(j(t)) = 1$ for
arbitrary t. True enough, for the function $u = f(j(t))$ maps the interval
$0 \le t \le 2\pi$ onto an interval $a \le u \le b$, $[a, b] \subset R$, and since
$f(j(0)) = f(E) = 1$, it follows that $0 < a \le 1 \le b$ (take note of Remark 3 of
Subsection 1), whence if there is a value of t for which $f(j(t)) \neq 1$, then
$a < b$. Besides, since j(t) is a subgroup, every element has an inverse.
Therefore from (5) it follows that $a < 1 < b$. Let j be an element of j(t) such
that $f(j) = b$. We have $f(j^2) = b^2 > b$, which is impossible since $j^2$ belongs
to j(t) and b is the maximum value of $f(j(t))$. We divide the remainder of the
proof into two parts.

(1) By Chapter IX (with account taken of the fact that $\det J = +1$), matrix
J can be represented as $J = j_1 \dots j_s$, where $j_1, \dots, j_s$ are matrices of
type (11) with different positions of the submatrix (*). From this and on the
basis of the foregoing,

$$f(J) = f(j_1) \dots f(j_s) = 1 \tag{12}$$

(2) Set

$$
b_k(x) = \begin{pmatrix}
1 &        &   &        & 0 \\
  & \ddots &   &        &   \\
  &        & x &        &   \\
  &        &   & \ddots &   \\
0 &        &   &        & 1
\end{pmatrix} \tag{13}
$$

where x is in the kth place on the diagonal. For a given x and for distinct k,
expression (13) yields equivalent matrices. Hence the function

$$\varphi(x) = f(b_k(x))$$

does not depend on k. Also note that $b_k(x)\,b_k(y) = b_k(xy)$, whence and on
the basis of (1) we find $\varphi(xy) = \varphi(x)\varphi(y)$, where $x, y \in R$. Therefore and on
the basis of Subsection 2 we have: either $\varphi(x) = |x|^{\alpha}$, $\alpha$ constant, or
$\varphi(x) = \pm|x|^{\alpha}$, where the plus and minus signs correspond to the cases $x > 0$ and
$x < 0$, respectively.

By Chapter IX, a linear transformation B relative to some basis is
represented by a diagonal matrix; let $\lambda_1, \dots, \lambda_n$ be the numbers on the
diagonal of this matrix.

We then have $B = b_1(\lambda_1) \dots b_n(\lambda_n)$, whence either

$$f(B) = \varphi(\lambda_1) \dots \varphi(\lambda_n) = |\lambda_1 \dots \lambda_n|^{\alpha} \tag{14}$$

or

$$f(B) = \varphi(\lambda_1) \dots \varphi(\lambda_n) = \pm|\lambda_1 \dots \lambda_n|^{\alpha} \tag{15}$$

We have (15) if $\varphi(x) = \pm|x|^{\alpha}$. Then in (15) the minus sign holds if
among the numbers $\lambda_1, \dots, \lambda_n$ there is an odd number of negative numbers,
that is, if $\lambda_1 \dots \lambda_n < 0$. But $\lambda_1 \dots \lambda_n = \det B = \det P$. From this and from
(10), (12), (14), (15) we get the assertion of the theorem.
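As a quick plausibility check of the conclusion, one may verify numerically that both candidate functions of the determinant are multiplicative. This is a sketch for 2 × 2 real matrices only; the helper names `det2`, `matmul2`, `f`, the matrices P, Q and the exponent 0.5 are assumptions chosen for illustration.

```python
import math

def det2(p):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]

def matmul2(p, q):
    """Product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def f(p, alpha: float, signed: bool) -> float:
    """|det P|^alpha, or the same value with the sign of det P if signed."""
    d = det2(p)
    value = abs(d) ** alpha
    return math.copysign(value, d) if signed else value

P = [[2.0, 1.0], [0.0, 3.0]]   # det P = 6
Q = [[1.0, 2.0], [3.0, 4.0]]   # det Q = -2
alpha = 0.5                    # arbitrary illustrative exponent

for signed in (False, True):
    lhs = f(matmul2(P, Q), alpha, signed)
    rhs = f(P, alpha, signed) * f(Q, alpha, signed)
    print(signed, math.isclose(lhs, rhs))
```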
1. Let L be a complex linear space. Besides the linear functions that have
been studied (Section 1, Chapter IV), in L we can consider so-called linear
functions of the second kind defined by the following axioms:

(1) $b(x + y) = b(x) + b(y)$ for any vectors x, y in L;

(2) $b(\alpha x) = \bar{\alpha}\, b(x)$ for any vector x in L and any complex number $\alpha$.

In contrast to these functions, the ordinary linear functions are called li¬ 
near functions of the first kind. 

There is no necessity to construct a separate theory of linear functions of
the second kind, for if a(x) is a linear function of the first kind, then
$b(x) = \overline{a(x)}$ is a linear function of the second kind; if b(x) is a linear function
of the second kind, then $a(x) = \overline{b(x)}$ is a linear function of the first kind.
However, the existence of two types of linear functions (forms) implies the
existence of distinct types of multilinear forms, namely, such as are linear of the
first kind in a certain set of their arguments and linear of the second kind in
the remaining arguments. In particular we have four types of bilinear forms:

(1) linear of the first kind in each of the arguments (they were investigated 
in Chapter IV); 

(2) linear of the first kind in the first argument and of the second kind in 
the second argument; 

(3) linear of the second kind in the first argument and of the first kind in 
the second argument; 

(4) linear of the second kind in both arguments. 

It is readily seen that the fourth type is obtained from the first by complex
conjugation, and the third is obtained from the second.

Our subject here will be a certain class of bilinear forms of the second type
that have important applications (in particular in the theory of functions of
complex variables and in quantum physics) and also related questions of
geometry.

DEFINITION. A bilinear form of the second type a(x, y) is said to be
Hermitian if $a(y, x) = \overline{a(x, y)}$ for any vectors x, y in L (the bar over the complex
number indicates, as usual, conjugation).

Thus, the function a(x, y) is called a bilinear Hermitian form if

(1) $a(x_1 + x_2, y) = a(x_1, y) + a(x_2, y)$

for arbitrary vectors $x_1$, $x_2$, y in L;

(2) $a(\alpha x, y) = \alpha\, a(x, y)$ for arbitrary vectors x, y in L and for any
complex number $\alpha$;

(3) $a(y, x) = \overline{a(x, y)}$ for arbitrary vectors x, y in L.
EXAMPLE. Let L be a linear space of continuous complex-valued functions
specified on the interval $t_1 \le t \le t_2$ of the real axis. Set

$$a(x, y) = \int_{t_1}^{t_2} x(t)\,\overline{y(t)}\,dt$$

Then the function a(x, y) is a bilinear Hermitian form.


2. Let a(x, y) be a bilinear Hermitian form. Then the function
$f(x) = a(x, x)$ is a quadratic Hermitian form, or simply a Hermitian form.

The original bilinear Hermitian form a(x,y) is said to be the polar of the 
Hermitian form f{x) = a{x,x). 

From the definition it follows directly that the (quadratic) Hermitian form
assumes only real values: $\overline{f(x)} = f(x)$.

A Hermitian form f(x) is said to be nonnegative (nonpositive) if $f(x) \ge 0$
($f(x) \le 0$) for any x in L, and positive definite (negative definite) if $f(x) > 0$
($f(x) < 0$) for any $x \neq 0$.

THEOREM 1. A Hermitian form f(x) uniquely defines its polar bilinear
form a(x, y).

PROOF. We have to express an unknown bilinear Hermitian form a(x, y) in
terms of a given function f(x) using the fact that $a(x, x) = f(x)$. We have

$$f(x + y) = a(x + y, x + y) = a(x, x) + a(y, y) + a(x, y) + a(y, x) = f(x) + f(y) + 2\,\mathrm{Re}\,a(x, y)$$

whence

$$\mathrm{Re}\,a(x, y) = \tfrac{1}{2}\,[f(x + y) - f(x) - f(y)] \tag{1}$$

Furthermore

$$f(x + iy) = a(x + iy, x + iy) = a(x, x) + a(iy, iy) + a(x, iy) + a(iy, x)$$

$$= a(x, x) + a(y, y) - i\,a(x, y) + i\,\overline{a(x, y)} = f(x) + f(y) + 2\,\mathrm{Im}\,a(x, y)$$

whence

$$\mathrm{Im}\,a(x, y) = \tfrac{1}{2}\,[f(x + iy) - f(x) - f(y)] \tag{2}$$

From (1) and (2) we get

$$a(x, y) = \tfrac{1}{2}\,\{f(x + y) + i f(x + iy) - (1 + i)[f(x) + f(y)]\} \tag{3}$$

which proves the theorem.

REMARK. Formula (3) may be given a more symmetric notation by replacing
y by (−y) and subtracting the resulting equation from (3) termwise.
We then have

$$a(x, y) = \tfrac{1}{4}\,\{f(x + y) - f(x - y) + i[f(x + iy) - f(x - iy)]\} \tag{4}$$
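The recovery of the polar form from the quadratic form can be tried out on a small example. The sketch below assumes a form given in coordinates by a 2 × 2 Hermitian matrix; the matrix A, the vectors x, y and the function names `a`, `f`, `polar` are illustrative choices, not taken from the text.

```python
def a(x, y, A):
    """Bilinear Hermitian form: linear in x, conjugate-linear in y."""
    n = len(x)
    return sum(A[j][k] * x[j] * y[k].conjugate()
               for j in range(n) for k in range(n))

def f(x, A):
    """Quadratic Hermitian form f(x) = a(x, x); real when A is Hermitian."""
    return a(x, x, A).real

def combine(u, v, c):
    """Return the vector u + c * v."""
    return [ui + c * vi for ui, vi in zip(u, v)]

def polar(x, y, A):
    """Recover a(x, y) from values of f alone, following formula (4)."""
    return 0.25 * (
        f(combine(x, y, 1), A) - f(combine(x, y, -1), A)
        + 1j * (f(combine(x, y, 1j), A) - f(combine(x, y, -1j), A))
    )

# Hypothetical Hermitian matrix and test vectors.
A = [[2 + 0j, 1 + 1j], [1 - 1j, 3 + 0j]]
x = [1 + 2j, -1j]
y = [0.5 + 0j, 2 - 1j]

print(abs(polar(x, y, A) - a(x, y, A)) < 1e-12)
```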
Note the difference between the formulas (3), (4) and the corresponding
formula (1) of Section 4, Chapter IV, and the resemblance between the latter
formula and formula (1) of this section.

3. Now suppose that the complex space L is n-dimensional and that
$e_1, \dots, e_n$ is a basis in it. Expanding the vectors x, y with respect to this basis
($x = \sum_j x^j e_j$, $y = \sum_k y^k e_k$) and using the definition of the bilinear Hermitian
form a(x, y), we get the coordinate representation:

$$a(x, y) = \sum_{j,k} a_{jk}\, x^j \bar{y}^k \tag{5}$$

where

$$a_{jk} = a(e_j, e_k) \tag{6}$$

At the same time,

$$f(x) = a(x, x) = \sum_{j,k} a_{jk}\, x^j \bar{x}^k \tag{7}$$

The matrix $A = \|a_{jk}\|$ is called the matrix of the Hermitian form (7) and of
its polar bilinear form (5) in the given basis. In matrix notation, formula (5)
looks like this:

$$a(x, y) = x^* A \bar{y} \tag{5a}$$

Here, x and y are column matrices (n × 1 matrices). As usual, the star
denotes the operation of transposition of the matrix. The bar on the matrix y means
that all the elements are to be replaced by complex-conjugate numbers.

The rank of a Hermitian form is the rank of its matrix. A Hermitian form is 
said to be nonsingular if its rank is equal to n. The invariance of rank rela¬ 
tive to choice of basis will be proved in the next subsection. 

We note in passing that the complex n × n matrix $A = \|a_{jk}\|$ is said to be
Hermitian if it satisfies the condition $a_{kj} = \bar{a}_{jk}$ or, in matrix notation,

$$A^* = \bar{A} \tag{8}$$

From (8) it follows that the determinant of any Hermitian matrix is real:

$$\overline{(\det A)} = \det \bar{A} = \det A^* = \det A \tag{9}$$

From (8) it also follows that a Hermitian matrix A is symmetric if and only
if it is real.

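4. Let us determine the law of transformation of coefficients of a Hermitian
form under a transformation of the basis. We pass from the original basis
$e_1, \dots, e_n$ to the new basis:

$$e_{j'} = \sum_j P^j_{j'}\, e_j \tag{10}$$

From (6) and (10) we find

$$a_{j'k'} = a(e_{j'}, e_{k'}) = a\Big(\sum_j P^j_{j'} e_j,\ \sum_k P^k_{k'} e_k\Big) = \sum_{j,k} P^j_{j'}\,\overline{P^k_{k'}}\; a(e_j, e_k) = \sum_{j,k} a_{jk}\, P^j_{j'}\,\overline{P^k_{k'}}$$

Thus

$$a_{j'k'} = \sum_{j,k} a_{jk}\, P^j_{j'}\,\overline{P^k_{k'}} \tag{11}$$

In matrix notation (11) becomes

$$A' = P A \bar{P}^* \tag{12}$$

From (12) follows directly the invariance of the rank of a Hermitian form,
since $\det P \neq 0$ and $\det \bar{P}^* = \overline{(\det P)} \neq 0$.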
5. We now show that the law of transformation of coefficients given by
formula (11) guarantees the invariance of the bilinear Hermitian form.

Let the function a(x, y), relative to a basis $e_1, \dots, e_n$, be given by
formula (5); when passing to a new basis (10) let its coefficients transform by
formula (11). We now verify that the numerical value of a(x, y) on an
arbitrary pair of vectors x, y is preserved. It will be convenient to use the
matrix formulas (5a) and (12) in addition to the coordinate notation. Let

$$a'(x, y) = \sum_{j',k'} a_{j'k'}\, x^{j'} \bar{y}^{k'} = (x')^* A' \bar{y}'$$

By Section 5, Chapter II, $x = P^* x'$, $y = P^* y'$ and so

$$a(x, y) = x^* A \bar{y} = (P^* x')^* A\, \overline{(P^* y')} = (x')^* P A \bar{P}^* \bar{y}' = (x')^* A' \bar{y}' = a'(x, y)$$

which is what we set out to prove.
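The invariance just proved can be observed numerically. The following sketch assumes that the matrix P is stored with rows indexed by the new basis vectors (so that old coordinates are obtained as x = P* x', the star meaning transposition as in the text); the concrete matrices and vectors, and the helper names, are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def conj(X):
    return [[complex(v).conjugate() for v in row] for row in X]

def form(A, x, y):
    """a(x, y) = sum over j, k of a_jk * x^j * conj(y^k), as in formula (5)."""
    n = len(x)
    return sum(A[j][k] * x[j] * complex(y[k]).conjugate()
               for j in range(n) for k in range(n))

def old_coordinates(P, x_new):
    """x = P^T x': old coordinates from new ones."""
    n = len(x_new)
    return [sum(P[jp][j] * x_new[jp] for jp in range(n)) for j in range(n)]

# Illustrative Hermitian matrix A and nonsingular transition matrix P.
A = [[2, 1 + 1j], [1 - 1j, -1]]
P = [[1, 2j], [1 - 1j, 3]]
A_new = matmul(matmul(P, A), transpose(conj(P)))   # formula (12)

x_new = [1 + 1j, -2]
y_new = [0.5j, 1 - 1j]
x_old = old_coordinates(P, x_new)
y_old = old_coordinates(P, y_new)

print(abs(form(A, x_old, y_old) - form(A_new, x_new, y_new)) < 1e-12)
```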

Also note that for the function a(x, y) formula (5) ensures linearity of the first
kind in the first argument and linearity of the second kind in the second argument,
and if $a_{kj} = \bar{a}_{jk}$, then $a(y, x) = \overline{a(x, y)}$. Thus, formula (5), with the foregoing
circumstances taken into account, yields the general aspect of bilinear
Hermitian forms in n-dimensional complex space.

6. Suppose, relative to a certain basis, the matrix of the Hermitian form
f(x) is such that $a_{jk} = 0$ when $j \neq k$. Then we say that the Hermitian form
f(x) in this basis is of canonical form:

$$f(x) = a_{11}\, x^1 \bar{x}^1 + a_{22}\, x^2 \bar{x}^2 + \dots + a_{nn}\, x^n \bar{x}^n \tag{13}$$

Note that all the coefficients $a_{jj}$ are real, since, generally, $a_{kj} = \bar{a}_{jk}$ (in any
basis).

Repeating the arguments and computations of Sections 5 to 9, Chapter IV, 
almost word for word, we establish the following. 

(1) A Hermitian form can be brought to canonical form by a nonsingular 
transformation of the variables, for instance, via the Lagrange method. 

REMARK. Bear in mind that here we have to do with the quantities $x^j \bar{x}^j$
instead of the squares $(x^j)^2$; we refer the reader to the example in Subsection
7 below.

(2) If all the principal minors of the matrix of a Hermitian form are other 
than zero, then reduction to canonical form can be carried out via the Jacobi 
method. 

REMARK. The principal minors of a Hermitian matrix are always real be¬ 
cause of (9), since each is itself a determinant of some Hermitian matrix. 

(3) For each Hermitian form f(x) the number of positive, negative and zero 
coefficients in its canonical form (13) is independent of the choice of the 
basis that gives the form its canonical form (the law of inertia of Hermitian 
forms). 

(4) A Hermitian form is positive definite if and only if all the principal mi¬ 
nors of its matrix are positive (Sylvester’s criterion for Hermitian forms). 

7. EXAMPLE. Let us reduce the Hermitian form

$$f = (1 + 3i)\,x^1 \bar{x}^2 + (1 - 3i)\,x^2 \bar{x}^1 \tag{14}$$

to canonical form via the Lagrange method. Since $a_{11} = a_{22} = 0$, we first apply
the transformation

$$x^1 = z^1 + z^2, \qquad x^2 = z^1 - z^2 \tag{15}$$

in order to obtain at least one nonzero coefficient on the principal diagonal of the
matrix of the form f. Substituting (15) into (14) and regrouping, we get

$$f = 2 z^1 \bar{z}^1 - 6i\, z^1 \bar{z}^2 + 6i\, z^2 \bar{z}^1 - 2 z^2 \bar{z}^2 \tag{16}$$

We are now in a situation that corresponds to the first case of Section 5,
Chapter IV. Set

$$y^1 = 2 z^1 + 6i\, z^2, \qquad y^2 = z^2 \tag{17}$$

Then the Hermitian form $g = f - \tfrac{1}{2} y^1 \bar{y}^1$ does not contain $y^1$ and therefore
depends solely on $y^2$, namely,

$$g = f - \tfrac{1}{2} y^1 \bar{y}^1 = -20\, z^2 \bar{z}^2 = -20\, y^2 \bar{y}^2$$

because of (16) and (17). Thus

$$f = \tfrac{1}{2} y^1 \bar{y}^1 - 20\, y^2 \bar{y}^2 \tag{18}$$

Now, expressing $z^1$ and $z^2$ from (15) and substituting them into (17), we get

$$y^1 = (1 + 3i)\,x^1 + (1 - 3i)\,x^2, \qquad y^2 = \tfrac{1}{2}(x^1 - x^2) \tag{19}$$
Thus, the Hermitian form (14) is reduced to canonical form (18) by a nonsin¬ 
gular linear transformation of the variables (19). The corresponding transfor¬ 
mation of a basis in two-dimensional complex space can be written out in ac¬ 
cordance with Section 5 of Chapter II. 
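The reduction in this example can be double-checked by evaluating both expressions at a few points. In the sketch below the expression for the second new variable, y² = (x¹ − x²)/2, is the one that follows from (15) and (17); the function names and the test points are illustrative assumptions.

```python
def f_original(x1: complex, x2: complex) -> float:
    """Form (14): (1 + 3i) x1 conj(x2) + (1 - 3i) x2 conj(x1)."""
    value = (1 + 3j) * x1 * x2.conjugate() + (1 - 3j) * x2 * x1.conjugate()
    return value.real  # the imaginary part vanishes for a Hermitian form

def f_canonical(x1: complex, x2: complex) -> float:
    """Form (18) evaluated through the new variables y1, y2."""
    y1 = (1 + 3j) * x1 + (1 - 3j) * x2
    y2 = 0.5 * (x1 - x2)
    return 0.5 * abs(y1) ** 2 - 20 * abs(y2) ** 2

# Hypothetical test points (x1, x2).
points = [(1 + 0j, 0j), (1 + 0j, 1 + 0j), (1 + 0j, 1j), (2 - 1j, -0.5 + 3j)]
for x1, x2 in points:
    print(abs(f_original(x1, x2) - f_canonical(x1, x2)) < 1e-9)
```

One positive and one negative coefficient appear in (18), so the form is indefinite, in agreement with the law of inertia stated in Subsection 6.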

8. Let a Hermitian form f(x) be reduced to the canonical form (13). Then,
putting $\tilde{x}^j = \sqrt{|a_{jj}|}\; x^j$ if $a_{jj} \neq 0$, and $\tilde{x}^j = x^j$ if $a_{jj} = 0$, we reduce f to
normal form via a nonsingular transformation of the variables. Dropping the tilde
we can write it thus:

$$f(x) = \sum_j \varepsilon_j\, x^j \bar{x}^j = \sum_j \varepsilon_j\, |x^j|^2 \tag{20}$$

where the $\varepsilon_j$ are equal to ±1 or zero.

9. Let there be given in a linear space L (complex, but not necessarily
finite-dimensional) a positive definite Hermitian form g(x). Let us
consider the polar bilinear Hermitian form a(x, y) and call its value on an
arbitrary pair of vectors x, y their scalar product:

$$(x, y) = a(x, y) \tag{21}$$

Accordingly, we define the norm of a vector $\|x\| = \sqrt{(x, x)} = \sqrt{g(x)}$ and also
introduce the concept of orthogonality by regarding the vectors x and y as
orthogonal if and only if their scalar product is zero: $(x, y) = 0$.

DEFINITION. A space L with a specified scalar product (21) is said to be 
unitary and the Hermitian form g(x) is called the metric form of that space. 
We also say that a unitary metric is introduced in the space L. 

The norm of an arbitrary vector x in unitary space is a nonnegative real
number ($\|x\| \ge 0$) which, due to the positive definiteness of the metric form
g(x), is equal to zero if and only if x is a zero vector. Here,
$\|\alpha x\| = \sqrt{(\alpha x, \alpha x)} = \sqrt{\alpha \bar{\alpha}\,(x, x)} = |\alpha| \cdot \|x\|$ for any scalar $\alpha$ and any vector x. In
Subsection 12 below we will see that the triangle inequality also holds.

By Theorem 1, the specification of the metric form g(x) uniquely defines the
scalar product of any pair of vectors. The properties of a scalar product differ
somewhat from the real case. Indeed, we have

$$(x_1 + x_2, y) = (x_1, y) + (x_2, y);$$

$$(x, y_1 + y_2) = (x, y_1) + (x, y_2);$$

$$(\alpha x, y) = \alpha\,(x, y)$$

but $(x, \alpha y) = \bar{\alpha}\,(x, y)$ and $(y, x) = \overline{(x, y)}$. The numerical values of a scalar
product are, generally, complex.

The orthogonality of a vector to a subspace, the orthogonality of subspaces,
and the orthogonal complement are defined in unitary space by analogy with
Euclidean space.




Fig. 123 Fig. 124 

REMARK. Occasionally, the requirement of positive definiteness of the met¬ 
ric Hermitian form g(x) is dropped. We then have a class of spaces that is 
more general than the class of unitary spaces. Such spaces are called spaces 
with a Hermitian metric. We shall not discuss them. 

10. EXAMPLE. A ONE-DIMENSIONAL UNITARY SPACE. In order to
construct this space we have to specify a positive definite Hermitian form g(x) in
a one-dimensional linear complex space $L_1$. For $L_1$ we take a coordinate space
which may be imagined as the ordinary plane of a complex variable, the
elements x, y, ... being complex numbers. The matrix of the desired Hermitian
form g contains the sole element $a_{11}$, which must be a real number ($a_{11} = \bar{a}_{11}$).
By Sylvester's criterion, $a_{11} > 0$. Suppose we choose $a_{11} = a^2$, $a > 0$. We take
the positive definite Hermitian form $g(x) = a^2 x \bar{x}$ for the metric form of $L_1$.
Then $\|x\| = a|x|$ so that the elements of $L_1$ with the given norm $\|x\|$ fill the
circle $|x| = \tfrac{1}{a}\|x\|$ on the complex plane (Fig. 123). In the resulting unitary
space, a scalar product is given by

$$(x, y) = a^2 x \bar{y} \tag{22}$$

To determine its geometric meaning, we put $x = u + iv$, $y = \xi + i\eta$. Then (22)
becomes

$$(x, y) = a^2 \left[ (u\xi + v\eta) - i \begin{vmatrix} u & v \\ \xi & \eta \end{vmatrix} \right] = a^2 (\tau - i\sigma) \tag{23}$$
Here, $\tau$ denotes the scalar product of the radius vectors Ox and Oy viewed as
vectors of a Euclidean plane, and $\sigma$ is the oriented area of a parallelogram
constructed on these vectors (Fig. 124). From (23) it is evident that $(x, y) = 0$
if and only if $\tau = \sigma = 0$, that is, when $x = 0$ or $y = 0$. It is therefore quite
evident that there cannot be two nonzero orthogonal vectors in a
one-dimensional unitary space. At the same time, this space can be naturally depicted as
a two-dimensional Euclidean plane, the norms of the elements being
proportional or, with an appropriate choice of scale (for a = 1), equal to the lengths
of the corresponding vectors of the Euclidean plane.

11. We now prove that in every finite-dimensional unitary space there is an
orthonormal basis, that is, a basis such that all its vectors are pairwise
orthogonal and have unit norms. Such, in actuality, is any basis in which the
metric form g(x) is of normal form:

$$g(x) = \|x\|^2 = x^1 \bar{x}^1 + \dots + x^n \bar{x}^n = \sum_j x^j \bar{x}^j \tag{24}$$

(see (20) for $\varepsilon_j = +1$, $j = 1, \dots, n$).

As in the case of a Euclidean space, an orthonormal basis is not defined
uniquely in a unitary space. Take the example of the preceding subsection; for the
(sole) vector constituting the orthonormal basis we can take (when a = 1)
any complex number with unit modulus ($e_1 = \cos\varphi + i \sin\varphi$).

Because of (24), the scalar product in any orthonormal basis of a unitary
space is given by

$$(x, y) = x^1 \bar{y}^1 + \dots + x^n \bar{y}^n \tag{25}$$

Using orthonormal bases, it is easy to verify (as was done in Section 5 of
Chapter VIII) that unitary spaces of the same dimension are metrically
isomorphic.

12. To get a better picture of the geometry of a unitary space in the
n-dimensional case, we compare an n-dimensional complex space $C_n$ with a real
space $E_{2n}$ of dimension double that of $C_n$ (see Section 11 of Chapter I). Let $C_n$ be a
unitary space and let $e_1, \dots, e_n$ be an orthonormal basis in it. Decompose an
arbitrary vector x of $C_n$ in terms of this basis and separate the real and
imaginary parts of each component (coordinate) $x^k$: $x^k = u^k + i v^k$. We can regard
$C_n$ as a coordinate space and write down its elements as

$$x = (u^1 + i v^1,\ u^2 + i v^2,\ \dots,\ u^n + i v^n)$$

Besides the basis $e_1, \dots, e_n$, consider the vectors $l_k = i e_k$, $k = 1, \dots, n$.
Then

$$x = u^1 e_1 + \dots + u^n e_n + v^1 l_1 + \dots + v^n l_n$$

Now write down the vector x as

$$x = \{u^1, v^1, u^2, v^2, \dots, u^n, v^n\}$$

viewing it as an element of a real coordinate space $E_{2n}$. Such precisely was
the comparison dealt with in Section 11 of Chapter I (with somewhat different
notation). By Section 11, Chapter I, the spaces $C_n$ and $E_{2n}$ are isomorphic
from the standpoint of the operations of addition and multiplication by real
factors. We now assume that $E_{2n}$ is Euclidean and that the basis
$e_1, l_1, e_2, l_2, \dots, e_n, l_n$ is orthonormal in it; compare a scalar product in the
spaces $C_n$ and $E_{2n}$. Besides x, take another arbitrary vector $y = \sum_k y^k e_k$,
$y^k = \xi^k + i \eta^k$, and again consider it as an element of the space $E_{2n}$:

$$y = \{\xi^1, \eta^1, \xi^2, \eta^2, \dots, \xi^n, \eta^n\} \in E_{2n}$$
Denote a scalar product in $E_{2n}$ by $(x, y)_E$. We then have

$$(x, y)_E = u^1 \xi^1 + v^1 \eta^1 + u^2 \xi^2 + v^2 \eta^2 + \dots + u^n \xi^n + v^n \eta^n \tag{26}$$

On the other hand, express as follows (via (25)) the scalar product $(x, y)_C$ of
the same vectors x, y (this time, however, they will be regarded as elements
of the unitary space $C_n$):

$$(x, y)_C = \sum_k x^k \bar{y}^k = \sum_k (u^k + i v^k)(\xi^k - i \eta^k) = \sum_k (u^k \xi^k + v^k \eta^k) - i \sum_k \begin{vmatrix} u^k & v^k \\ \xi^k & \eta^k \end{vmatrix} \tag{27}$$

Comparing (26) and (27), we see that $(x, y)_E = \mathrm{Re}\,(x, y)_C$, whence it is clear
that vectors which are orthogonal in the unitary space $C_n$ are also orthogonal
in the Euclidean space $E_{2n}$. The converse is not true: vectors orthogonal in $E_{2n}$
may not be orthogonal in $C_n$. Such for example is the case if $\mathrm{Re}\,(x, y)_C = 0$
but $\mathrm{Im}\,(x, y)_C \neq 0$.

It is important to observe that despite the difference between the scalar
products $(x, y)_C$ and $(x, y)_E$, the norm $\|x\|_C$ of the vector x in the unitary space
$C_n$ coincides with its Euclidean norm $\|x\|_E$:

$$\|x\|_C^2 = \sum_k (u^k)^2 + \sum_k (v^k)^2 = \|x\|_E^2 \tag{28}$$
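The relation between the two scalar products is easy to observe on concrete vectors. The sketch below is only an illustration: the names `scalar_c`, `scalar_e`, `realify` and the sample vectors are assumptions, and the coordinates are taken in an orthonormal basis as in the text.

```python
def scalar_c(x, y):
    """Unitary scalar product (25): sum of x^k * conj(y^k)."""
    return sum(xk * yk.conjugate() for xk, yk in zip(x, y))

def realify(x):
    """Map (u1 + i v1, ..., un + i vn) to the real vector {u1, v1, ..., un, vn}."""
    out = []
    for xk in x:
        out += [xk.real, xk.imag]
    return out

def scalar_e(p, q):
    """Euclidean scalar product (26) of two real coordinate vectors."""
    return sum(a * b for a, b in zip(p, q))

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

# (x, y)_E equals the real part of (x, y)_C.
print(scalar_c(x, y), scalar_e(realify(x), realify(y)))

# The norms coincide, as in (28).
print(abs(scalar_c(x, x).real - scalar_e(realify(x), realify(x))) < 1e-12)

# Vectors orthogonal in E_2n need not be orthogonal in C_n: take e and i*e.
e, ie = [1 + 0j], [1j]
print(scalar_e(realify(e), realify(ie)), scalar_c(e, ie))
```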


To summarize, then, the spaces $C_n$ and $E_{2n}$ are isometric, but the scalar
product in the unitary space $C_n$ carries more information concerning the
factors because of the imaginary part. One should therefore expect that the
geometric properties of a unitary space will prove to be more fragile than those
of the Euclidean space isometric to it, that is, that the group of linear
transformations preserving them is narrower. In Subsection 16 it will become evident
that this is precisely the case.

Incidentally, from (28) it follows that in a unitary space we have the
triangle inequality

$$\|x + y\|_C \le \|x\|_C + \|y\|_C \tag{29}$$

since (29) definitely holds in Euclidean space.

Further observe that the even-dimensional Euclidean space $E_{2n}$ may be
converted in the following manner into a unitary space $C_n$ of half the dimension
of $E_{2n}$ and isometric to it. In $E_{2n}$ take an orthonormal basis $e_1, \dots, e_n$,
$e_{n+1}, \dots, e_{2n}$ and, taking half the vectors of this basis, say the vectors
$e_1, \dots, e_n$, let us define for each of them the operation of multiplication by the
imaginary unit, setting

$$i e_1 = e_{n+1},\ \dots,\ i e_n = e_{2n}$$

Then it is easy to see that for any vector x of $E_{2n}$ its product by any
complex number will be defined and in such a way that $E_{2n}$ becomes a complex
linear space of dimension n. If, as before, we assume the vectors $e_1, \dots, e_n$
to be orthonormal, then (24) will define the norm of any vector, which means
that the scalar product (25) will also be defined. We will then have the unitary
space $C_n$ considered above. (We obtain complete equality with the notation
introduced at the beginning of this subsection by setting $l_k = i e_k = e_{n+k}$,
$k = 1, \dots, n$.)

Thus, up to an isomorphism, the unitary space $C_n$ is precisely the real
Euclidean space $E_{2n}$ equipped with certain supplementary properties.

13. Let us now take up the study of the most important classes of linear 
transformations in unitary spaces. 

Given, in a unitary space L, the linear transformation A. 
DEFINITION. A linear transformation $\overset{*}{A}$ is said to be the conjugate of A
if $(Ax, z) = (x, \overset{*}{A}z)$ for any vectors x, z in L.

The existence and uniqueness of the conjugate transformation $\overset{*}{A}$ may be
proved as in the case of a Euclidean space (see pp. 308-309) by first establishing
the existence and uniqueness of the reciprocal basis (as is done on pp. 289-290).
We may also proceed somewhat differently by confining ourselves to a
consideration of the transformations in an orthonormal basis, in which case the
reciprocal basis coincides with the given basis and the reasoning is somewhat
simplified.

Let A be the matrix of the given transformation y = Ax in an orthonormal
basis. Then, by (25), the scalar product (y, z) may be written in matrix
notation thus:

$$(y, z) = \bar{z}^* A x \tag{30}$$

where x and z are column matrices (n × 1 matrices). Relative to the same
basis let us consider the transformation B with the matrix $\bar{A}^*$ obtained from
matrix A via a transposition and a replacement of its elements by complex
conjugate numbers. We will show that $B = \overset{*}{A}$. As in (30), we compute the scalar
product:

$$(x, Bz) = \overline{(Bz, x)} = \overline{[\bar{x}^* \bar{A}^* z]} = \overline{[\bar{x}^* \bar{A}^* z]^*} = \overline{[z^* \bar{A} \bar{x}]} = \bar{z}^* A x = (y, z)$$

Thus, in an orthonormal basis the matrix of the conjugate transformation is
given by

$$\overset{*}{A} = \bar{A}^* \tag{31}$$

From (31) it follows that the transformation A is conjugate to the
transformation $\overset{*}{A}$.
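The defining property of the conjugate transformation can be checked directly in coordinates. This is a small sketch assuming an orthonormal basis, so that the scalar product is given by (25); the matrix A, the vectors x, z and the helper names are illustrative.

```python
def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def scalar(x, y):
    """Scalar product (25): sum of x^k * conj(y^k)."""
    return sum(complex(xk) * complex(yk).conjugate() for xk, yk in zip(x, y))

def conjugate_transpose(A):
    """Transpose A and replace every element by its complex conjugate."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[1 + 2j, 3j], [2, 1 - 1j]]
A_star = conjugate_transpose(A)   # matrix of the conjugate transformation, by (31)

x = [1 - 1j, 2 + 0.5j]
z = [0.5j, -1 + 3j]

lhs = scalar(apply(A, x), z)        # (Ax, z)
rhs = scalar(x, apply(A_star, z))   # (x, A* z)
print(abs(lhs - rhs) < 1e-12)
```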

14. DEFINITION. A linear transformation A in a unitary space is said to
be normal if

$$A \overset{*}{A} = \overset{*}{A} A \tag{32}$$

REMARK. In Euclidean space the notion of a normal transformation is also
introduced by means of formula (32).

THEOREM 2. A linear transformation A in n-dimensional unitary space is 
normal if and only if there exists an orthonormal basis made up of the eigen¬ 
vectors of A. 

The proof of Theorem 2 is given at the end of this subsection. Let us first
establish two lemmas.

LEMMA 1. The commutative linear transformations A and B (AB = BA)
in an n-dimensional complex linear space L always have a common
eigenvector.

PROOF OF LEMMA 1. Since L is a complex space, the transformation A
has in it an eigenvector x ($Ax = \lambda x$, $x \neq 0$). Because of the commutativity of A
and B, every nonzero vector of type

$$x,\ Bx,\ B^2 x,\ \dots,\ B^k x,\ \dots \tag{33}$$

is an eigenvector of A. Indeed,

$$A B^k x = B^k A x = B^k \lambda x = \lambda B^k x \tag{34}$$

In the sequence (33) let the first p vectors be linearly independent, and let
the (p + 1)th vector $B^p x$ be linearly expressible in terms of them. Then the
linear hull $\tilde{L} = L(x, Bx, \dots, B^{p-1} x)$ is an invariant subspace of
transformation B. The transformation B has an eigenvector y in $\tilde{L}$. By (34) y is also an
eigenvector of A.

LEMMA 2. If in a unitary space L a subspace L' is invariant under a linear
transformation A, then its orthogonal complement L'' is invariant under the
conjugate transformation $\overset{*}{A}$.

PROOF OF LEMMA 2. Let $x \in L'$, $y \in L''$. Then $Ax \in L'$ and $(Ax, y) = 0$.
But $(Ax, y) = (x, \overset{*}{A}y)$ and so the vector $\overset{*}{A}y$ is orthogonal to the vector
x. Since x is arbitrary in L', it follows that $\overset{*}{A}y \in L''$.

PROOF OF THEOREM 2. Let A and $\overset{*}{A}$ be commutative. Then by Lemma 1
they have a common eigenvector $x_1$. Its linear hull $L_1 = L(x_1)$ is invariant
under A and $\overset{*}{A}$. Hence, by Lemma 2, the orthogonal complement $L_1''$ of subspace
$L_1$ is also invariant under $\overset{*}{A}$ and A. By Lemma 1 there will be in $L_1''$ a
common eigenvector $x_2$ of transformations A and $\overset{*}{A}$, which is obviously orthogonal
to $x_1$. We then consider the linear hull $L_2 = L(x_1, x_2)$ and its orthogonal
complement $L_2''$, in which we find a common eigenvector $x_3$, and so forth.
Continuing this process we obtain a set of n pairwise orthogonal common
eigenvectors $x_1, \dots, x_n$ of A and $\overset{*}{A}$. Normalizing these vectors, we obtain the desired
basis.

REMARK. The fact that a set of n pairwise orthogonal nonzero vectors
constitutes a basis in n-dimensional unitary space is demonstrated by the
arguments given on page 258 for the real case.

Now let it be given that the transformation A has an orthonormal basis
consisting of eigenvectors. Then by Subsection 5, Section 8, Chapter VII, the matrix
of A in this basis is diagonal. By (31) the matrix of $\overset{*}{A}$ is also diagonal. But
diagonal matrices are always commutative, and so also are the transformations
A and $\overset{*}{A}$. The proof is complete.

15. DEFINITION. A linear transformation A in unitary space is said to be
self-adjoint or Hermitian if $\overset{*}{A} = A$.

From the definition it follows directly that self-adjoint transformations are 
a special case of normal transformations. 

THEOREM 3. A normal transformation A in n-dimensional unitary space is 
self-adjoint if and only if all its eigenvalues are real. 

PROOF. Theorem 3 is obvious: it suffices to write A in an orthonormal ba¬ 
sis made up of eigenvectors and use (31). 

REMARK 1. This theorem shows that a self-adjoint transformation operates 
in n-dimensional unitary space in the same way as in Euclidean space: it con¬ 
stitutes a stretching with real coefficients along the n mutually orthogonal di¬ 
rections. 

REMARK 2. From (31) it is evident that A is self-adjoint if and only if
its matrix in an arbitrary orthonormal basis is Hermitian ($A^* = \bar{A}$).

16. DEFINITION. A nonsingular linear transformation A in unitary space is
said to be unitary if $\overset{*}{A} = A^{-1}$.

Unitary transformations are also a special case of normal transformations
since A and $A^{-1}$ are clearly commutative ($A A^{-1} = A^{-1} A = E$). Arguing
precisely as on pp. 323-329, we can establish that in n-dimensional unitary space
only those linear transformations are unitary that preserve the vector norm and
scalar product, which is to say, isometric transformations.

From this it follows immediately that unitary transformations constitute a 
group, called the unitary group (of n-dimensional space). 
An isometric transformation in Euclidean space may not have a single
eigenvector. The situation in unitary space is different: by Theorem 2, every unitary
transformation has an orthonormal basis made up of eigenvectors.

THEOREM 4. A linear transformation A in n-dimensional unitary space is
unitary if and only if there is an orthonormal basis of its eigenvectors and all
eigenvalues are equal to unity in absolute value.

PROOF. The existence of an orthonormal basis $e_1, \dots, e_n$ of eigenvectors
of the transformation A follows from Theorem 2. If all the eigenvalues $\lambda_j$ are
equal to unity in absolute value ($|\lambda_j| = 1$), then for an arbitrary
vector $x = \sum_j x^j e_j$ we have $Ax = \sum_j x^j A e_j = \sum_j x^j \lambda_j e_j$,
$\|Ax\|^2 = \sum_j |x^j \lambda_j|^2 = \sum_j |x^j|^2 = \|x\|^2$. Thus, A is isometric and therefore unitary. If it is given
that A is unitary, then $\overset{*}{A} A = E$. And so for an arbitrary eigenvector y we have

$$(y, y) = (\overset{*}{A} A y, y) = (Ay, Ay) = (\lambda y, \lambda y) = \lambda \bar{\lambda}\,(y, y)$$

Hence $\lambda \bar{\lambda} = |\lambda|^2 = 1$ and Theorem 4 is proved.
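A plane rotation gives a concrete illustration: as a real transformation it has no eigenvectors (for angles other than multiples of π), yet regarded in a two-dimensional unitary space it has an orthonormal basis of eigenvectors with eigenvalues of unit modulus. The sketch below is an assumption-laden example; the angle 0.7 and the helper names are arbitrary.

```python
import cmath
import math

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def scalar(x, y):
    """Scalar product (25): sum of x^k * conj(y^k)."""
    return sum(complex(xk) * complex(yk).conjugate() for xk, yk in zip(x, y))

t = 0.7  # arbitrary rotation angle
U = [[math.cos(t), -math.sin(t)],
     [math.sin(t), math.cos(t)]]

r = 1 / math.sqrt(2)
eigenpairs = [
    (cmath.exp(1j * t), [r, -1j * r]),   # eigenvalue cos t + i sin t
    (cmath.exp(-1j * t), [r, 1j * r]),   # eigenvalue cos t - i sin t
]

for lam, v in eigenpairs:
    Uv = apply(U, v)
    residual = max(abs(Uv[k] - lam * v[k]) for k in range(2))
    print(residual < 1e-12, abs(abs(lam) - 1) < 1e-12)

# The two eigenvectors are orthogonal in the unitary scalar product.
print(abs(scalar(eigenpairs[0][1], eigenpairs[1][1])) < 1e-12)
```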

17. DEFINITION. A nonsingular (complex) n × n matrix U is said to be
unitary if it satisfies the condition

$$U^{-1} = \bar{U}^* \tag{35}$$

Condition (35) may be rewritten thus: $U \bar{U}^* = E$ or $\bar{U}^* U = E$, whence it is
seen that the unitarity of matrix U signifies the orthonormality of the set of
its rows and the set of its columns in the sense of the scalar product (25).

From (35) it follows that unitary matrices constitute a group. Indeed:

(1) if the matrix U is unitary, then $U^{-1}$ is also unitary:

$$(U^{-1})^{-1} = U = \overline{(\bar{U}^*)}^{\,*} = \overline{(U^{-1})}^{\,*}$$

(2) if the matrices $U_1$ and $U_2$ are unitary, then $U_1 U_2$ is unitary:

$$(U_1 U_2)^{-1} = U_2^{-1} U_1^{-1} = \bar{U}_2^* \bar{U}_1^* = \overline{(U_1 U_2)}^{\,*}$$

Comparing (31) and (35), we see that all unitary matrices (these and no
others) specify unitary transformations in orthonormal bases. From this it
also follows that unitary n × n matrices form a group, and this group is
isomorphic to the unitary group of n-dimensional space.
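The group properties of unitary matrices can be spot-checked on small examples. The matrices U1, U2 and the helper names below are illustrative assumptions; the test used for unitarity is the rewritten form of condition (35), the product of U with its conjugate transpose being the identity.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def conjugate_transpose(X):
    """Transpose X and replace every element by its complex conjugate."""
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def is_unitary(U, tol: float = 1e-12) -> bool:
    """Check that U times its conjugate transpose is the identity matrix."""
    n = len(U)
    product = matmul(U, conjugate_transpose(U))
    return all(abs(product[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
U1 = [[s, 1j * s], [1j * s, s]]
U2 = [[1j, 0], [0, -1]]          # diagonal, |a_jj| = 1, one nonreal entry

print(is_unitary(U1), is_unitary(U2))
print(is_unitary(matmul(U1, U2)))             # product of unitary matrices
print(is_unitary(conjugate_transpose(U1)))    # inverse of a unitary matrix
print(is_unitary([[1, 1], [0, 1]]))           # a nonsingular, non-unitary matrix
```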

18. THEOREM 5. The orthogonal group of n-dimensional Euclidean space
is isomorphic to a subgroup of the unitary group of n-dimensional unitary
space, which group in turn is isomorphic to a subgroup of the rotation group of
2n-dimensional Euclidean space, which subgroup coincides with the whole
rotation group in the single special case n = 1.

PROOF. The first assertion of the theorem follows directly from the fact
that orthogonal matrices are a special case of unitary matrices: a matrix is
orthogonal if and only if it is unitary and real. We note in passing that for any
n the orthogonal group does not exhaust the whole unitary group, since there
are unitary but not orthogonal n × n matrices. Such for instance is any
diagonal matrix in which $|a_{jj}| = 1$ for all j but at least one of the numbers $a_{jj}$ is
nonreal.

For n = 1 the unitary matrix U has a unique element $a_{11}$ with $|a_{11}| = 1$,
since in this case the matrix equation $U \bar{U}^* = E$ reduces to the numerical
equation $a_{11} \bar{a}_{11} = 1$. The space here is the plane of a complex variable with scalar
product $(x, y) = x \bar{y}$. For an orthonormal basis we can take the number e = 1
on the plane. The transformation specified in this basis by the matrix
$U = \|a_{11}\| = \|\cos\varphi + i \sin\varphi\|$ is nothing but the rotation of the complex plane
through the angle $\varphi$. Since the angle here can be arbitrary, we have established that
when n = 1, the unitary group actually coincides with the rotation group of
a two-dimensional Euclidean plane.

Furthermore, assuming $n \geq 2$ we take advantage of the construction given 
in Subsection 11. This construction permits regarding one and the same space 
as a unitary space $C_n$ and as a Euclidean space $E_{2n}$. At the same time it 
permits viewing each unitary transformation in $C_n$ as an isometric linear 
transformation in $E_{2n}$. Let $U$ be a unitary transformation in $C_n$. By Theorem 4 
there is an orthonormal basis $\tilde e_1, \ldots, \tilde e_n$ made up of eigenvectors, and the 
eigenvalues are of the form $\cos\varphi_k + i\sin\varphi_k$, $k = 1, \ldots, n$. Set $\tilde l_k = i\tilde e_k$, 
$L_k = L(\tilde e_k, \tilde l_k) \subset E_{2n}$. The transformation $U$, when regarded in $E_{2n}$, operates in 



this rotation group has elements that cannot be regarded as unitary or even as 
linear transformations in $C_n$. Such for example is the rotation $B$ that carries 
the basis vectors $e_1, l_1, e_2, l_2, \ldots, e_n, l_n$ into the vectors 
$e_1, e_2, -l_1, l_2, \ldots, e_n, l_n$, respectively, which is to say that it rotates the 
plane $L(l_1, e_2)$ through the angle $\pi/2$ 
and leaves fixed the remaining vectors of the basis $e_1, \ldots, l_n$. 

Indeed, in the complex space $C_n$ the linearity of a transformation $A$ 
and the condition $Ae_1 = e_1$ would imply $Al_1 = A(ie_1) = iAe_1 = l_1$, but 
by no means $Al_1 = e_2$. The proof of Theorem 5 is complete. 
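
The failure of complex linearity can be seen on a small example. The sketch below takes $n = 2$ and writes the rotation $B$ in complex coordinates $(z_1, z_2)$, assuming the identification $z_1 = a + ib$, $z_2 = c + id$ with the vector $ae_1 + bl_1 + ce_2 + dl_2$, where $l_k = ie_k$. The function names are illustrative assumptions.

```python
def B(z):
    """Rotation B for n = 2 in complex coordinates (z1, z2).

    With z1 = a + ib and z2 = c + id the vector is a*e1 + b*l1 + c*e2 + d*l2;
    B sends it to a*e1 + b*e2 - c*l1 + d*l2.
    """
    z1, z2 = z
    a, b, c, d = z1.real, z1.imag, z2.real, z2.imag
    return (complex(a, -c), complex(b, d))

def scale(k, z):
    """Multiply a vector of C_2 by the complex scalar k."""
    return (k * z[0], k * z[1])

def norm_squared(z):
    """Squared length, the same in C_2 and in the Euclidean space E_4."""
    return abs(z[0]) ** 2 + abs(z[1]) ** 2

e1 = (1 + 0j, 0j)
e2 = (0j, 1 + 0j)
l1 = scale(1j, e1)

print(B(e1) == e1)                    # True: e1 stays fixed
print(B(l1) == e2)                    # True: l1 is carried into e2
print(B(l1) == scale(1j, B(e1)))      # False: B(i e1) differs from i B(e1)
```

So $B$ preserves lengths in $E_4$ (it only permutes real coordinates and changes one sign) but does not commute with multiplication by $i$, which is exactly the obstruction used in the proof.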

REMARK. The operation of rotation B when n = 2 is shown schematically 
in Fig. 125. 

19. Let $L$ be an n-dimensional complex space. As in Subsections 6-9, Section 
6, Chapter VIII, we establish the following. 

(1) If a unitary metric is introduced in $L$, then the collection of all 
orthonormal bases in the metric forms a class of bases relative to the unitary group. 

(2) For every class of bases relative to the unitary group we can indicate 
a unitary metric in which the bases of this class are orthonormal. 

(3) If the unitary metric has been chosen, then the transformation of 
coordinates when passing from one orthonormal basis to another (also 
orthonormal) basis is specified by a unitary matrix. 

20. As in Sections 4-5 of Chapter IX, it can be proved that a Hermitian 
form can be reduced to canonical form (13) by a transformation of variables 
with a unitary matrix, and that a pair of Hermitian forms can be brought to 
canonical form via a nonsingular transformation of the variables if at least 
one of them is positive definite. 

REMARK. Let $A$ be the matrix of the Hermitian form $f$ in the initial basis 
$e$. To bring the form to canonical form we need an auxiliary Hermitian 
transformation with matrix $\bar A = A^*$ (in the same basis $e$). Let $\tilde e = Pe$ be an 
orthonormal basis of eigenvectors of this transformation. Relative to the basis $\tilde e$, the 
matrix $\Lambda$ of the transformation at hand is diagonal and real with 
$\Lambda = Q\bar AQ^{-1} = Q\bar AP^*$ according to Section 2, Chapter VII. When passing from 
basis $e$ to basis $\tilde e$, the matrix of the form $f$ transforms via (12): $A' = PA\bar P^*$. 
Since the matrix $P$ is unitary, $P^{-1} = \bar P^*$, we have 

$$Q = (P^{-1})^* = \bar P, \qquad P = \bar Q \tag{36}$$ 

Taking into account the foregoing, we see that 

$$A' = PA\bar P^* = \bar QA\bar P^* = \overline{(Q\bar AP^*)} = \bar\Lambda = \Lambda \tag{37}$$ 

so that the form $f$ is of canonical form in the basis $\tilde e$. 

EXAMPLE. Let the Hermitian form $f$ in the orthonormal basis $e_1, e_2$ be 
given by (14), that is, 

$$f = (1 + 3i)\,x^1\bar x^2 + (1 - 3i)\,\bar x^1x^2$$ 

It is required to bring it to canonical form while remaining in the class of 
orthonormal bases. For the auxiliary Hermitian transformation with the matrix 

$$A^* = \begin{Vmatrix} 0 & 1 - 3i \\ 1 + 3i & 0 \end{Vmatrix}$$ 

we find the characteristic polynomial $\det(A^* - \lambda E) = \det(\bar A - \lambda E) = \lambda^2 - 10$, 
its roots $\lambda_{1,2} = \pm\sqrt{10}$, and the eigenvectors $\tilde e_1$ and $\tilde e_2$ with the 
eigenvalues $\lambda_1$ and $\lambda_2$, respectively: 

$$\tilde e_1 = \frac{1 - 3i}{2\sqrt5}\,e_1 + \frac{1}{\sqrt2}\,e_2, \qquad \tilde e_2 = \frac{1}{\sqrt2}\,e_1 - \frac{1 + 3i}{2\sqrt5}\,e_2 \tag{38}$$ 

The vectors (38) have already been normalized: $\|\tilde e_1\| = \|\tilde e_2\| = 1$; they 
constitute the desired basis in which 

$$f = \lambda_1y^1\bar y^1 + \lambda_2y^2\bar y^2 = \sqrt{10}\,y^1\bar y^1 - \sqrt{10}\,y^2\bar y^2$$ 

Taking into account (36), it is easy to write down the transformation of 
coordinates: 

$$y^1 = \frac{1 + 3i}{2\sqrt5}\,x^1 + \frac{1}{\sqrt2}\,x^2, \qquad y^2 = \frac{1}{\sqrt2}\,x^1 - \frac{1 - 3i}{2\sqrt5}\,x^2$$ 
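
The example can be checked numerically. The sketch below assumes the column-vector convention for applying the matrix $A^*$ and takes the form (14) as $(1+3i)x^1\bar x^2 + (1-3i)\bar x^1x^2$; all function and variable names are illustrative assumptions. It verifies that the vectors (38) are eigenvectors with eigenvalues $\pm\sqrt{10}$ and that the form agrees with its canonical expression in the new coordinates.

```python
import math

SQRT10 = math.sqrt(10)

# Matrix of the auxiliary Hermitian transformation and the vectors (38),
# written by their coordinates in the basis e1, e2.
A_STAR = [[0, 1 - 3j], [1 + 3j, 0]]
E1 = ((1 - 3j) / (2 * math.sqrt(5)), 1 / math.sqrt(2))
E2 = (1 / math.sqrt(2), -(1 + 3j) / (2 * math.sqrt(5)))

def apply(m, v):
    """Apply a 2x2 matrix to a column vector (assumed convention)."""
    return tuple(m[i][0] * v[0] + m[i][1] * v[1] for i in range(2))

def is_eigenvector(m, v, lam, tol=1e-12):
    """Check m v = lam v componentwise."""
    w = apply(m, v)
    return all(abs(w[i] - lam * v[i]) < tol for i in range(2))

def f(x1, x2):
    """The form (14); its value is real, so the real part is returned."""
    value = (1 + 3j) * x1 * x2.conjugate() + (1 - 3j) * x1.conjugate() * x2
    return value.real

def to_new_coordinates(x1, x2):
    """Coordinates y1, y2 relative to the basis (38)."""
    y1 = (1 + 3j) / (2 * math.sqrt(5)) * x1 + x2 / math.sqrt(2)
    y2 = x1 / math.sqrt(2) - (1 - 3j) / (2 * math.sqrt(5)) * x2
    return y1, y2

def canonical(y1, y2):
    """Canonical form sqrt(10) |y1|^2 - sqrt(10) |y2|^2."""
    return SQRT10 * abs(y1) ** 2 - SQRT10 * abs(y2) ** 2

x = (0.5 - 2j, 1.25 + 0.75j)   # arbitrary sample vector
print(is_eigenvector(A_STAR, E1, SQRT10))                        # True
print(is_eigenvector(A_STAR, E2, -SQRT10))                       # True
print(abs(f(*x) - canonical(*to_new_coordinates(*x))) < 1e-12)   # True
```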


21. We conclude this section with the statement of an important theorem 
that may be proved by analogy with Section 11 of Chapter IX. 

THEOREM 6. For any nonsingular linear transformation $A$ in n-dimensional 
unitary space there is a self-adjoint transformation $H$ and a unitary 
transformation $U$ such that $A = UH$. 
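
A numerical illustration of Theorem 6 for $n = 2$ may help. The sketch below is not the proof referred to in the text; it assumes the column-vector convention and takes $H$ to be the positive definite square root of $\bar A^*A$, computed with the closed-form expression for the square root of a 2×2 positive definite matrix, after which $U = AH^{-1}$. The function names and the sample matrix are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def polar_2x2(a):
    """Return (U, H) with a = U H, U unitary, H Hermitian positive definite.

    H is the square root of M = adjoint(a) * a, using the 2x2 formula
    sqrt(M) = (M + s E) / t with s = sqrt(det M), t = sqrt(tr M + 2 s).
    The matrix a is assumed to be nonsingular.
    """
    m = matmul(adjoint(a), a)
    det_m = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    s = math.sqrt(det_m)
    t = math.sqrt((m[0][0] + m[1][1]).real + 2 * s)
    h = [[(m[i][j] + (s if i == j else 0)) / t for j in range(2)]
         for i in range(2)]
    det_h = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    h_inv = [[h[1][1] / det_h, -h[0][1] / det_h],
             [-h[1][0] / det_h, h[0][0] / det_h]]
    return matmul(a, h_inv), h

def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

A = [[1 + 1j, 2], [0, 1 - 2j]]     # sample nonsingular matrix
U, H = polar_2x2(A)
E = [[1, 0], [0, 1]]

print(close(matmul(adjoint(U), U), E))   # True: U is unitary
print(close(H, adjoint(H)))              # True: H is self-adjoint
print(close(matmul(U, H), A))            # True: A = U H
```

For $n = 1$ the statement reduces to the polar form of a nonzero complex number, $a = (\cos\varphi + i\sin\varphi)\,|a|$.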